1 In this vignette

In this cell annotation vignette, we will be taking the output from the Data Integration vignette and identifying cell types and subtypes by expression of key marker genes. Herein, we will do the following:

  1. Perform low-resolution clustering
  2. Identify cell types using differentially expressed genes.
  3. Recluster each cell type at higher resolution.
  4. Identify cell subtypes within broad cell type clusters using differentially expressed genes.
  5. Run SingleR to help confirm annotations.

2 Prepare the R environment

2.1 Load in the necessary libraries

library(Seurat)
library(ggplot2)
library(dplyr)
library(cowplot)
library(SeuratWrappers)
library(plotly)
library(RColorBrewer)
library(clustree)
library(SingleR)
library(formattable)

2.2 Load in our Seurat object

This .rds file comes from Vignette #2 - Data Integration.

data.integrated <- readRDS("../2_DataIntegration_Vignette/2_DataIntegration.rds")

3 Clustering and visualization

Seurat uses K-nearest neighbor clustering based on PCA and then clusters the cells using the Louvain algorithm. Nearest neighbor clustering requires input of dimensions. Based on the JackStrawPlot and ElbowPlot, we will use all 50 PCs and could likely use even more. However, it is noted that there are diminishing returns on the number of PCs, so 50 will be sufficient.

data.integrated <- FindNeighbors(data.integrated, dims=1:50) 
## Computing nearest neighbor graph
## Computing SNN

3.1 Clustree

Resolution of the Louvain clustering algorithm greatly affects number of identified clusters. This can be visualized by iterating over multiple resolutions and then plotting using Clustree. The resulting clustering is stored in the metaData of our integrated Seurat object.

#Calculate clusters for a number of resolutions ranging from 0.2-1.0
res <- c(0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0)
data.integrated <- FindClusters(data.integrated, resolution=res)
## Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
## 
## Number of nodes: 33322
## Number of edges: 1402393
## 
## Running Louvain algorithm...
## Maximum modularity in 10 random starts: 0.9627
## Number of communities: 20
## Elapsed time: 6 seconds
## Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
## 
## Number of nodes: 33322
## Number of edges: 1402393
## 
## Running Louvain algorithm...
## Maximum modularity in 10 random starts: 0.9541
## Number of communities: 25
## Elapsed time: 6 seconds
## Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
## 
## Number of nodes: 33322
## Number of edges: 1402393
## 
## Running Louvain algorithm...
## Maximum modularity in 10 random starts: 0.9476
## Number of communities: 28
## Elapsed time: 6 seconds
## Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
## 
## Number of nodes: 33322
## Number of edges: 1402393
## 
## Running Louvain algorithm...
## Maximum modularity in 10 random starts: 0.9418
## Number of communities: 31
## Elapsed time: 6 seconds
## Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
## 
## Number of nodes: 33322
## Number of edges: 1402393
## 
## Running Louvain algorithm...
## Maximum modularity in 10 random starts: 0.9371
## Number of communities: 31
## Elapsed time: 6 seconds
## Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
## 
## Number of nodes: 33322
## Number of edges: 1402393
## 
## Running Louvain algorithm...
## Maximum modularity in 10 random starts: 0.9326
## Number of communities: 33
## Elapsed time: 6 seconds
## Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
## 
## Number of nodes: 33322
## Number of edges: 1402393
## 
## Running Louvain algorithm...
## Maximum modularity in 10 random starts: 0.9283
## Number of communities: 37
## Elapsed time: 6 seconds
## Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
## 
## Number of nodes: 33322
## Number of edges: 1402393
## 
## Running Louvain algorithm...
## Maximum modularity in 10 random starts: 0.9241
## Number of communities: 39
## Elapsed time: 6 seconds
## Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
## 
## Number of nodes: 33322
## Number of edges: 1402393
## 
## Running Louvain algorithm...
## Maximum modularity in 10 random starts: 0.9201
## Number of communities: 39
## Elapsed time: 6 seconds
#Plot using Clustree
clustree(data.integrated, prefix="integrated_snn_res.")
## Warning: The `add` argument of `group_by()` is deprecated as of dplyr 1.0.0.
## Please use the `.add` argument instead.

#Recluster at the selected resolution of interest. This is not absolutely necessary, but is a nice redundancy and resets the `seurat_clusters` column in the metadata automatically.
data.integrated <- FindClusters(data.integrated, resolution=0.4)
## Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
## 
## Number of nodes: 33322
## Number of edges: 1402393
## 
## Running Louvain algorithm...
## Maximum modularity in 10 random starts: 0.9476
## Number of communities: 28
## Elapsed time: 5 seconds

4 SingleR

The R package SingleR uses microarray reference datasets to attempt to annotate each individual cell. We will use this package as a tool to support cluster identification. https://bioconductor.org/packages/release/bioc/html/SingleR.html References for SingleR are provided by the pokedex for cells celldex. In this case, we will run SinglerR with two different databases: 1) Immgen and 2) MouseRNAseq from the gene expression omnibus.

#Warning, this takes a very long time to run and scales poorly with number of cells, but is not computationally expensive (can run with 8 gb of RAM.
#Load in the database information
ImmGen <- ImmGenData() 
## Warning: 'ImmGenData' is deprecated.
## Use 'celldex::ImmGenData' instead.
## See help("Deprecated")
## snapshotDate(): 2021-05-18
## see ?celldex and browseVignettes('celldex') for documentation
## loading from cache
## see ?celldex and browseVignettes('celldex') for documentation
## loading from cache
MouseRNAseq <- MouseRNAseqData()
## snapshotDate(): 2021-05-18
## see ?celldex and browseVignettes('celldex') for documentation
## loading from cache
## see ?celldex and browseVignettes('celldex') for documentation
## loading from cache
#SingleR requires conversion to a single cell experiment object
singleR <- as.SingleCellExperiment(data.integrated, assay= "integrated") 
#Create main labels using the Immgen database
singler.ImmGen <- SingleR(test = singleR, ref = ImmGen, labels = (ImmGen$label.main))
#Create main labels using the Immgen database
singler.MouseRNAseq <- SingleR(test = singleR, ref = MouseRNAseq, labels = (MouseRNAseq$label.main))
#Save the singleR labels into our Seurat object as a metaData column
data.integrated[["ImmGen"]] <- singler.ImmGen$labels 
data.integrated[["MouseRNAseqdb"]] <- singler.MouseRNAseq$labels
#Plot the main labels and compare them to the original Seurat clusters
SR1 <- DimPlot(data.integrated, label=T, repel=T) + NoLegend()
SR2 <- DimPlot(data.integrated, label=T, repel=T, group.by="ImmGen") + NoLegend()
SR3 <- DimPlot(data.integrated, label=T, repel=T, group.by="MouseRNAseqdb") + NoLegend()
SR1 + SR2 + SR3
## Warning: ggrepel: 8 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps

5 Low-level Cluster Identification

Seurat has a built in function for differential expression across identities. In this case, our identities are currently numbered clusters and setting FindAllMarkers to report only genes expression in at least 50% of cells (for a given cluster) and with a positive fold change will give us markers that are highly expressed within each cluster. The max.cells.per.ident option limits the amount of cells included to reduce computational time.

lowlevel.markers <- FindAllMarkers(data.integrated, only.pos = TRUE, min.pct = 0.50, logfc.threshold = 0.25, max.cells.per.ident=1000, verbose=FALSE)

#Store the top10 markers for each cluster and write it to a table
top10 <- lowlevel.markers %>% group_by(cluster) %>% top_n(n = 10, wt = avg_log2FC)

#Put the top10 markers for each cluster into a formatted table
top10 %>% formattable()
p_val avg_log2FC pct.1 pct.2 p_val_adj cluster gene
2.204445e-287 3.2733807 0.999 0.673 4.408890e-284 0 Selenop
3.765615e-278 2.7684359 0.999 0.557 7.531230e-275 0 C1qc
1.121794e-273 2.8516298 0.999 0.489 2.243588e-270 0 C1qa
2.390962e-265 2.5646835 0.999 0.554 4.781924e-262 0 C1qb
2.368984e-257 2.8163310 0.999 0.575 4.737969e-254 0 Apoe
2.370869e-255 2.9843800 0.996 0.450 4.741738e-252 0 Pf4
5.702047e-240 2.4362855 0.998 0.593 1.140409e-236 0 Pltp
1.357335e-207 2.8264101 0.900 0.366 2.714670e-204 0 Cbr2
3.900404e-180 2.3649790 0.893 0.344 7.800808e-177 0 Folr2
1.925071e-111 2.6400285 0.815 0.456 3.850142e-108 0 Ccl8
8.220399e-192 1.1934868 0.846 0.190 1.644080e-188 1 Igkv8-30
5.682940e-43 0.7535382 0.344 0.551 1.136588e-39 1 Gm26917
1.374123e-38 0.7733946 0.711 0.420 2.748246e-35 1 Atp6v0d2
7.986853e-38 1.0257686 0.686 0.209 1.597371e-34 1 Igkv3-2
2.984435e-29 1.5775792 0.615 0.147 5.968870e-26 1 Igkv19-93
1.624575e-26 0.8348826 0.799 0.624 3.249150e-23 1 Fabp5
2.629720e-20 0.7111138 0.935 0.911 5.259439e-17 1 Prdx1
4.651868e-10 1.6805620 0.533 0.154 9.303735e-07 1 Cma1
9.682367e-06 1.2939057 0.523 0.153 1.936473e-02 1 Tpsb2
6.123045e-05 0.6727795 0.653 0.529 1.224609e-01 1 Pla2g7
1.788606e-183 1.9390654 0.998 0.700 3.577212e-180 2 Lyz2
1.124362e-176 1.7067890 0.944 0.546 2.248725e-173 2 Lst1
1.593060e-167 1.7966249 0.963 0.592 3.186119e-164 2 Ms4a6c
9.448228e-155 2.0972228 0.989 0.685 1.889646e-151 2 Ifitm3
5.375437e-148 1.7122860 0.982 0.823 1.075087e-144 2 Prdx5
1.391176e-120 1.5645674 0.760 0.343 2.782352e-117 2 Sirpb1c
1.006656e-99 1.9192618 0.874 0.526 2.013311e-96 2 Il1b
6.440672e-66 2.0975351 0.770 0.517 1.288134e-62 2 Plac8
7.910219e-32 1.8633784 0.704 0.510 1.582044e-28 2 Cxcl9
2.885236e-07 1.6500587 0.601 0.527 5.770473e-04 2 Cxcl10
0.000000e+00 4.7799811 0.998 0.414 0.000000e+00 3 Cd79a
1.093237e-301 3.7429075 0.983 0.415 2.186474e-298 3 Cd79b
2.767634e-297 4.3215389 0.984 0.380 5.535269e-294 3 Ly6d
2.717702e-281 3.3098711 0.945 0.257 5.435404e-278 3 Ms4a1
3.888374e-254 2.9318569 0.944 0.454 7.776747e-251 3 H2-DMb2
1.643444e-203 2.6502791 0.874 0.306 3.286888e-200 3 Iglc3
7.823993e-198 2.4734851 0.896 0.318 1.564799e-194 3 Iglc2
2.140715e-139 2.2579382 0.803 0.299 4.281431e-136 3 Ebf1
9.989603e-134 2.4485614 0.763 0.239 1.997921e-130 3 Fcmr
7.000078e-127 2.3399668 0.745 0.286 1.400016e-123 3 Ighd
1.488688e-274 2.8740327 0.996 0.581 2.977376e-271 4 Lpl
8.837480e-273 2.7445951 1.000 0.806 1.767496e-269 4 Lgals3
8.081045e-247 2.5382207 0.984 0.515 1.616209e-243 4 Trem2
3.339546e-240 2.2143654 0.992 0.576 6.679092e-237 4 Cd36
6.545996e-230 2.2898030 0.998 0.789 1.309199e-226 4 Ctsd
7.804839e-205 2.1610881 0.990 0.737 1.560968e-201 4 Anxa1
2.704274e-159 2.2922252 0.828 0.441 5.408547e-156 4 Gdf15
1.055975e-143 2.8297315 0.802 0.318 2.111949e-140 4 Gpnmb
2.605881e-74 2.7871300 0.672 0.255 5.211762e-71 4 Mmp12
3.909452e-68 2.2310279 0.724 0.343 7.818904e-65 4 Ctsk
0.000000e+00 3.4526084 1.000 0.884 0.000000e+00 5 Cst3
2.335810e-302 2.8325288 0.991 0.532 4.671620e-299 5 Naaa
3.564827e-300 2.2024370 0.978 0.342 7.129654e-297 5 Clec9a
1.906393e-289 2.6807061 0.994 0.668 3.812787e-286 5 Irf8
1.722317e-276 2.5832926 0.993 0.595 3.444635e-273 5 Plbd1
2.004989e-273 2.4032388 0.981 0.555 4.009979e-270 5 Ppt1
7.912738e-266 2.0693468 0.986 0.459 1.582548e-262 5 Wdfy4
2.021446e-264 2.1154419 0.990 0.644 4.042891e-261 5 Gm2a
2.608526e-244 2.1333803 0.976 0.428 5.217053e-241 5 Cd24a
1.311438e-231 2.1622829 0.932 0.414 2.622876e-228 5 Rtl8c
1.162098e-173 1.4567935 0.998 0.827 2.324195e-170 6 H2-Ab1
1.176692e-156 1.4028706 0.991 0.801 2.353385e-153 6 Lsp1
2.162521e-148 2.0932829 0.841 0.491 4.325041e-145 6 Tnfsf9
4.545976e-128 1.3649905 0.953 0.611 9.091952e-125 6 Napsa
1.632052e-127 1.9768721 0.898 0.536 3.264103e-124 6 Il1b
3.037064e-119 1.5137601 0.832 0.478 6.074129e-116 6 Tnip3
2.708316e-92 2.6750790 0.714 0.281 5.416632e-89 6 Cd209a
4.725833e-63 1.9469267 0.670 0.373 9.451666e-60 6 Ear2
5.028765e-56 1.8900960 0.713 0.401 1.005753e-52 6 Fn1
7.584320e-08 1.7143310 0.638 0.516 1.516864e-04 6 Lyz1
1.720520e-247 2.5871026 0.979 0.496 3.441041e-244 7 Cd3g
2.782103e-234 2.4668968 0.971 0.491 5.564206e-231 7 Cd3e
3.610375e-230 2.3306726 0.964 0.536 7.220750e-227 7 Cd3d
1.725840e-200 3.2723736 0.853 0.376 3.451680e-197 7 Tnfrsf4
7.987865e-172 2.7694239 0.869 0.509 1.597573e-168 7 Ctla4
3.266643e-167 2.0673723 0.979 0.781 6.533287e-164 7 AW112010
1.749500e-166 2.1100529 0.830 0.426 3.499000e-163 7 Icos
1.180868e-163 2.3089053 0.862 0.436 2.361736e-160 7 Tnfrsf18
1.509837e-133 2.0975406 0.750 0.393 3.019673e-130 7 Ikzf2
1.815537e-46 3.1063654 0.623 0.432 3.631075e-43 7 Areg
1.965096e-276 3.1430577 0.924 0.197 3.930193e-273 8 Ncr1
2.967364e-274 3.1188238 0.999 0.562 5.934728e-271 8 Nkg7
3.570166e-270 2.9675779 0.999 0.780 7.140332e-267 8 AW112010
1.611834e-263 2.8419647 0.964 0.481 3.223669e-260 8 Ctsw
5.417348e-263 3.6783132 0.918 0.297 1.083470e-259 8 Xcl1
7.797901e-263 2.8975007 0.969 0.508 1.559580e-259 8 Il2rb
2.245034e-232 2.9497521 0.890 0.361 4.490069e-229 8 Klrb1c
4.143432e-213 2.9107101 0.822 0.254 8.286865e-210 8 Prf1
9.288241e-207 2.6199757 0.890 0.415 1.857648e-203 8 Cd7
3.627012e-168 4.7894160 0.753 0.260 7.254025e-165 8 Gzma
0.000000e+00 5.8378008 0.992 0.178 0.000000e+00 9 Dcn
0.000000e+00 5.7040182 0.997 0.551 0.000000e+00 9 Gsn
0.000000e+00 4.2223587 0.982 0.226 0.000000e+00 9 Serping1
1.666293e-303 3.7954575 0.990 0.392 3.332585e-300 9 C3
5.875800e-292 3.5595392 0.967 0.309 1.175160e-288 9 Gpx3
8.758582e-287 4.5248522 0.961 0.211 1.751716e-283 9 Clec3b
4.615889e-282 4.0279987 0.966 0.352 9.231778e-279 9 Igfbp7
3.894426e-261 3.9531362 0.977 0.544 7.788852e-258 9 Cebpd
1.849781e-242 3.2862205 0.952 0.528 3.699563e-239 9 Id3
1.164315e-216 3.8360513 0.906 0.253 2.328629e-213 9 Sparc
1.094560e-216 2.0657181 0.975 0.538 2.189120e-213 10 Cd3d
1.862478e-203 2.6896993 0.936 0.569 3.724956e-200 10 Ly6c2
2.676799e-202 2.0825434 0.968 0.576 5.353597e-199 10 Ms4a4b
8.677553e-190 1.7184548 0.959 0.494 1.735511e-186 10 Cd3e
6.734735e-184 1.7006030 0.947 0.512 1.346947e-180 10 Lck
2.897437e-178 1.7971217 0.844 0.343 5.794874e-175 10 Tcf7
1.365335e-174 1.7974966 0.947 0.504 2.730669e-171 10 Thy1
1.713206e-173 1.7562783 0.956 0.597 3.426411e-170 10 Ccl5
2.020857e-125 1.6632964 0.854 0.439 4.041715e-122 10 Trbc2
4.271217e-67 1.7389740 0.719 0.424 8.542434e-64 10 Cd7
1.366457e-298 4.3755646 0.998 0.597 2.732913e-295 11 Ccl5
1.485543e-280 3.4411066 0.997 0.564 2.971087e-277 11 Nkg7
5.418790e-266 2.9405115 0.990 0.577 1.083758e-262 11 Ms4a4b
1.178276e-264 3.7467698 0.926 0.349 2.356552e-261 11 Cd8b1
6.478566e-248 2.6919827 0.974 0.496 1.295713e-244 11 Cd3e
7.310302e-241 3.3268773 0.894 0.365 1.462060e-237 11 Gzmk
9.187769e-241 2.6516882 0.973 0.540 1.837554e-237 11 Cd3d
4.061041e-240 2.6285212 0.974 0.500 8.122083e-237 11 Cd3g
3.707064e-232 2.4957990 0.939 0.506 7.414129e-229 11 Lat
9.254575e-210 2.8987733 0.858 0.369 1.850915e-206 11 Cd8a
5.644190e-259 2.5800321 0.981 0.484 1.128838e-255 12 Stmn1
4.376341e-239 2.1614168 0.938 0.330 8.752681e-236 12 Birc5
6.070325e-225 2.1119797 0.996 0.786 1.214065e-221 12 Tuba1b
9.326214e-218 1.6477712 0.918 0.370 1.865243e-214 12 Tk1
8.873670e-214 2.0386987 0.995 0.802 1.774734e-210 12 Tubb5
1.304747e-211 1.7381359 0.908 0.352 2.609494e-208 12 Pclaf
5.460677e-191 1.6506697 0.894 0.459 1.092135e-187 12 Top2a
2.283680e-133 2.2375279 0.810 0.359 4.567361e-130 12 Ube2c
4.725809e-127 1.6162227 0.972 0.709 9.451619e-124 12 Hmgb2
2.958507e-74 2.3722323 0.775 0.439 5.917014e-71 12 Hist1h2ap
4.566835e-245 3.2578885 0.994 0.387 9.133670e-242 13 Cxcr6
5.050357e-216 2.7551170 0.990 0.510 1.010071e-212 13 Cd3g
1.310833e-211 2.8460121 0.971 0.464 2.621666e-208 13 Il7r
3.360611e-203 2.7336304 0.904 0.358 6.721223e-200 13 Actn2
4.060657e-198 2.5770626 0.956 0.385 8.121313e-195 13 Ikzf3
4.355333e-196 3.1322533 0.847 0.160 8.710665e-193 13 Tcrg-V6
2.450697e-195 3.7681954 0.854 0.186 4.901394e-192 13 Cd163l1
1.562943e-193 4.6566444 0.872 0.278 3.125886e-190 13 Trdv4
4.258710e-162 2.6505595 0.815 0.169 8.517419e-159 13 Trdc
8.095549e-129 2.5397097 0.850 0.565 1.619110e-125 13 Rgcc
2.234809e-199 4.2652511 0.939 0.432 4.469618e-196 14 Areg
2.211420e-198 4.1322656 0.970 0.565 4.422840e-195 14 Hilpda
1.028540e-186 2.7379575 0.901 0.266 2.057080e-183 14 Klrg1
2.303502e-178 2.5981343 0.937 0.389 4.607004e-175 14 Cxcr6
3.569811e-171 2.6896374 0.912 0.468 7.139621e-168 14 Itk
2.511854e-167 2.9545727 0.841 0.204 5.023707e-164 14 Ccr8
2.680122e-116 2.6872598 0.752 0.232 5.360245e-113 14 Arg1
5.370727e-105 2.4260294 0.805 0.447 1.074145e-101 14 Gata3
1.291962e-75 2.4591385 0.778 0.517 2.583925e-72 14 Lmo4
1.367314e-14 2.9534890 0.538 0.287 2.734628e-11 14 Csf2
1.430433e-238 4.5889341 0.995 0.413 2.860865e-235 15 Ccr7
2.511297e-209 3.4525616 0.973 0.599 5.022593e-206 15 Tmem123
3.712795e-203 3.1701204 0.954 0.448 7.425589e-200 15 Il4i1
2.773674e-199 4.7060784 0.921 0.323 5.547349e-196 15 Fscn1
5.802395e-163 2.8170346 0.938 0.633 1.160479e-159 15 Tspan3
1.106407e-133 3.1242161 0.818 0.376 2.212815e-130 15 Ccl22
1.759331e-119 2.9074417 0.728 0.148 3.518662e-116 15 Apol7c
1.344000e-105 3.1061386 0.875 0.635 2.688000e-102 15 Fabp5
1.479372e-102 3.0081776 0.695 0.201 2.958744e-99 15 Il12b
3.720049e-99 3.0570300 0.909 0.663 7.440098e-96 15 Epsti1
4.419900e-202 3.8196518 0.996 0.557 8.839799e-199 16 Gngt2
3.951499e-178 3.0711655 0.998 0.599 7.902998e-175 16 Msrb1
4.814620e-178 2.8580055 0.987 0.575 9.629240e-175 16 Lst1
9.129207e-156 3.1681137 0.953 0.382 1.825841e-152 16 Ear2
2.517856e-150 3.0161141 0.919 0.354 5.035713e-147 16 Ace
4.473623e-139 2.6125079 0.927 0.385 8.947246e-136 16 Ifitm6
4.375031e-137 2.8106390 0.942 0.516 8.750062e-134 16 Pglyrp1
7.072014e-133 2.6903165 0.955 0.533 1.414403e-129 16 Plac8
5.226081e-125 2.6562942 0.870 0.255 1.045216e-121 16 Treml4
1.674944e-96 2.5439001 0.831 0.396 3.349887e-93 16 Eno3
1.940061e-195 6.8935106 0.995 0.290 3.880122e-192 17 Jchain
8.589551e-175 5.9145585 0.980 0.501 1.717910e-171 17 Igkc
3.761565e-165 7.0090309 0.933 0.227 7.523129e-162 17 Iglv1
1.012652e-160 6.2678339 0.968 0.653 2.025305e-157 17 Ighm
5.530097e-123 6.4162187 0.714 0.149 1.106019e-119 17 Igkv3-5
1.655636e-92 7.4179285 0.724 0.246 3.311271e-89 17 Igkv3-2
1.146709e-73 6.3638962 0.635 0.267 2.293419e-70 17 Igkv3-10
4.103861e-57 5.5773850 0.510 0.049 8.207721e-54 17 Igkv1-117
1.492014e-44 6.3462965 0.562 0.238 2.984028e-41 17 Igkv17-121
4.037721e-34 5.5177794 0.502 0.190 8.075442e-31 17 Igkv2-109
4.245031e-153 3.3423899 1.000 0.713 8.490063e-150 18 Hmgb2
2.038511e-137 2.5708947 0.976 0.493 4.077021e-134 18 Stmn1
1.453815e-136 2.2388824 0.958 0.453 2.907630e-133 18 Mki67
3.322470e-132 2.8725476 0.929 0.361 6.644939e-129 18 Pclaf
5.921607e-129 2.2405000 0.961 0.528 1.184321e-125 18 Dut
2.564572e-118 2.3557514 0.908 0.341 5.129145e-115 18 Birc5
1.353960e-116 2.4305848 0.911 0.467 2.707921e-113 18 Top2a
1.368116e-114 2.7088852 0.887 0.366 2.736231e-111 18 Ube2c
1.602869e-91 2.3583983 0.887 0.505 3.205738e-88 18 H2afx
3.627306e-79 3.5725562 0.813 0.444 7.254612e-76 18 Hist1h2ap
5.428585e-172 7.3256614 1.000 0.103 1.085717e-168 19 Mcpt4
2.611234e-164 7.2632147 1.000 0.179 5.222468e-161 19 Tpsb2
1.559864e-159 7.2422849 1.000 0.181 3.119727e-156 19 Cma1
5.528504e-158 6.1625705 0.996 0.140 1.105701e-154 19 Cpa3
1.008235e-136 4.0619822 0.923 0.104 2.016471e-133 19 Fcer1a
3.213310e-127 5.0901896 0.944 0.290 6.426620e-124 19 Hdc
1.151088e-124 3.9888072 0.986 0.713 2.302177e-121 19 Jun
6.336756e-120 4.0545302 0.961 0.502 1.267351e-116 19 Ccl2
3.642700e-119 4.7164455 0.937 0.456 7.285399e-116 19 Serpinb1a
1.354505e-35 4.1755179 0.697 0.408 2.709009e-32 19 Ccl7
1.087603e-130 2.8502137 1.000 0.714 2.175205e-127 20 Hmgb2
4.217440e-121 2.7397130 0.996 0.505 8.434880e-118 20 H2afx
2.511728e-119 2.3140089 0.996 0.554 5.023457e-116 20 Naaa
1.193647e-117 2.3892104 1.000 0.890 2.387295e-114 20 Cst3
6.273036e-114 2.2351330 0.993 0.684 1.254607e-110 20 Irf8
2.419714e-104 2.3087095 0.937 0.362 4.839428e-101 20 Pclaf
6.290424e-103 2.2119185 0.996 0.806 1.258085e-99 20 Tubb5
1.345089e-102 2.3637218 0.937 0.367 2.690178e-99 20 Ube2c
6.894258e-95 2.1613241 0.993 0.790 1.378852e-91 20 Tuba1b
1.737667e-88 2.7846271 0.914 0.444 3.475333e-85 20 Hist1h2ap
4.559889e-163 8.1814325 0.996 0.202 9.119779e-160 21 Ighv11-2
1.100138e-156 9.1700803 1.000 0.291 2.200276e-153 21 Igkv14-126
8.035587e-141 4.7798094 1.000 0.293 1.607117e-137 21 Jchain
1.119871e-124 4.7006576 0.988 0.503 2.239742e-121 21 Igkc
2.448200e-124 4.5171688 0.996 0.654 4.896399e-121 21 Ighm
6.013557e-113 2.6552746 0.965 0.221 1.202711e-109 21 Iglc1
1.753400e-101 3.2516509 0.950 0.338 3.506800e-98 21 Mzb1
1.043355e-94 3.0621636 0.946 0.355 2.086711e-91 21 Iglc2
1.194685e-89 3.3653450 0.730 0.112 2.389370e-86 21 Ighv1-39
1.288364e-75 2.7556636 0.876 0.517 2.576728e-72 21 Txndc5
8.159333e-121 4.2235024 0.985 0.094 1.631867e-117 22 Clu
3.261344e-117 4.3829823 0.995 0.254 6.522687e-114 22 Efemp1
6.310086e-110 3.3219906 0.995 0.392 1.262017e-106 22 Aebp1
6.029672e-109 3.9765111 0.995 0.211 1.205934e-105 22 Dcn
1.248169e-105 3.4955474 0.985 0.256 2.496337e-102 22 Serping1
6.870331e-105 3.5278683 0.990 0.392 1.374066e-101 22 Rarres2
1.234470e-101 3.8778375 0.995 0.416 2.468940e-98 22 C3
2.942678e-70 3.6690370 0.901 0.312 5.885355e-67 22 Igfbp6
2.685445e-40 3.0280798 0.803 0.447 5.370890e-37 22 Hspb1
1.373034e-31 3.2171966 0.759 0.350 2.746067e-28 22 Slpi
2.079035e-163 9.1231462 0.980 0.030 4.158069e-160 23 S100a9
7.988334e-130 6.0250691 0.851 0.054 1.597667e-126 23 Retnlg
1.387399e-119 8.8399015 0.990 0.277 2.774799e-116 23 S100a8
5.537872e-106 4.5774104 0.990 0.603 1.107574e-102 23 Msrb1
1.377552e-99 4.4032719 0.975 0.554 2.755104e-96 23 Il1b
1.387203e-97 4.2545411 0.945 0.402 2.774406e-94 23 Csf3r
5.856747e-97 4.9925113 0.955 0.349 1.171349e-93 23 Slpi
1.031866e-88 6.0578696 0.900 0.410 2.063732e-85 23 G0s2
9.645739e-79 4.0191416 0.846 0.292 1.929148e-75 23 Hcar2
1.785933e-44 4.1705390 0.756 0.419 3.571866e-41 23 Ifitm1
1.248688e-108 2.5516898 1.000 0.205 2.497376e-105 24 Ighv11-2
9.006299e-102 2.3238014 1.000 0.293 1.801260e-98 24 Igkv14-126
2.336318e-83 2.9943866 1.000 0.303 4.672637e-80 24 Ms4a1
8.714840e-78 2.7526211 1.000 0.357 1.742968e-74 24 Iglc2
2.600791e-76 3.2119901 1.000 0.454 5.201582e-73 24 Cd79a
6.140559e-75 2.2582757 0.986 0.340 1.228112e-71 24 Mzb1
2.086910e-74 2.9536397 0.993 0.454 4.173819e-71 24 Cd79b
1.402111e-70 2.1604940 0.939 0.274 2.804222e-67 24 Fcmr
1.735769e-70 2.4212141 0.980 0.344 3.471537e-67 24 Iglc3
7.664837e-66 2.4033807 0.993 0.537 1.532967e-62 24 Plac8
1.419026e-68 4.6113177 0.972 0.294 2.838052e-65 25 Cldn5
5.750502e-66 7.1624312 1.000 0.590 1.150100e-62 25 Fabp4
3.722888e-64 3.6131564 0.889 0.203 7.445775e-61 25 Car4
4.557783e-64 4.5197296 0.963 0.267 9.115566e-61 25 Cav1
4.305298e-63 5.0160194 0.991 0.454 8.610597e-60 25 Ly6c1
4.383829e-63 4.4785232 0.972 0.475 8.767658e-60 25 Mgll
1.646428e-59 3.9209942 0.972 0.281 3.292856e-56 25 Sparc
5.743588e-53 4.4041692 0.935 0.448 1.148718e-49 25 Hspb1
4.038108e-48 5.1026001 0.907 0.325 8.076216e-45 25 Gpihbp1
2.002827e-12 4.0668830 0.213 0.509 4.005654e-09 25 Car3
8.376765e-123 4.5393693 1.000 0.031 1.675353e-119 26 Alox15
1.038854e-70 3.7094278 1.000 0.196 2.077709e-67 26 Serpinb2
3.203417e-70 4.8314007 1.000 0.206 6.406835e-67 26 Prg4
1.413965e-65 6.8955031 0.979 0.196 2.827930e-62 26 Saa3
5.197820e-60 5.0189462 1.000 0.417 1.039564e-56 26 Fn1
8.619503e-58 6.6989202 0.989 0.350 1.723901e-54 26 Slpi
4.260168e-55 3.3729247 1.000 0.513 8.520335e-52 26 Ecm1
5.470622e-54 3.7158681 0.989 0.521 1.094124e-50 26 Lyz1
3.358556e-51 3.3181598 1.000 0.646 6.717113e-48 26 Wfdc17
2.748532e-11 3.5282562 0.638 0.257 5.497064e-08 26 Cxcl13
5.672058e-102 3.9803227 0.886 0.059 1.134412e-98 27 Klk1
2.631656e-53 4.9593429 1.000 0.197 5.263312e-50 27 Siglech
1.347054e-44 4.3966876 1.000 0.763 2.694107e-41 27 Bst2
6.758234e-44 4.4148955 0.929 0.183 1.351647e-40 27 Cox6a2
3.554736e-43 3.3774794 1.000 0.686 7.109472e-40 27 Irf8
1.409637e-41 3.0580651 0.986 0.467 2.819275e-38 27 Atp1b1
3.788852e-41 3.2985962 1.000 0.538 7.577704e-38 27 Plac8
4.363986e-41 3.0043321 0.986 0.549 8.727972e-38 27 Rnase6
5.228466e-36 3.0711277 0.986 0.423 1.045693e-32 27 Ly6d
3.949152e-35 3.1100210 0.886 0.312 7.898303e-32 27 Cd209d

5.1 Rename low-level identities

Using the annotations from SingleR and the differentially expressed genes, we can assign annotations to each cell type cluster. For this, we heavily reference literature and seek to identify known cell type marker genes. We will further refine and adjust these results when we have more granular cell type annotations after subclustering.

#Rename clusters based on their deduced identities
data.integrated <- RenameIdents(data.integrated,
                                '0' = "Macrophages",
                                '1' = "Macrophages",
                                '2' = "Monocytes",
                                '3' = "B Cells",
                                '4' = "Macrophages",
                                '5' = "Dendritic Cells",
                                '6' = "Dendritic Cells",
                                '7' = "T Cells",
                                '8' = "NK Cells",
                                '9' = "Stromal Cells",
                                '10' = "T Cells",
                                '11' = "T Cells",
                                '12' = "Macrophages",
                                '13' = "T Cells",
                                '14' = "ILCs",
                                '15' = "Dendritic Cells",
                                '16' = "Monocytes",
                                '17' = "Plasma Cells",
                                '18' = "T Cells",
                                '19' = "Mast Cells",
                                '20' = "Dendritic Cells",
                                '21' = "Plasma Cells",
                                '22' = "Stromal Cells",
                                '23' = "Neutrophils",
                                '24' = "B Cells",
                                '25' = "Endothelial Cells",
                                '26' = "Macrophages",
                                '27' = "Dendritic Cells")

6 Plot lowlevel clusters

#Save and plot the low level identification
data.integrated$lowlevel <- Idents(data.integrated)
DimPlot(data.integrated, label=T) + NoLegend()

7 High-level Cluster Identification

For each of the 12 cell type clusters, we will subset the cells and repeat clustering to define subclusters. In this vignette, we refer to these subclusters as high-resolution or high-level clusters.

7.1 T and NK Cells

7.1.1 PCA and clustering

Run PCA and determine dimensions for clustering.

NKT <- subset(data.integrated, idents = c("T Cells", "NK Cells"))
NKT <- RunPCA(NKT, features=VariableFeatures(NKT),dims=1:30, verbose=FALSE)
ElbowPlot(NKT, ndims=30, reduction="pca")

Generate a clustering tree and determine clustering resolution for subclusters.

NKT <- FindNeighbors(NKT, dims=1:15, verbose=FALSE) 
NKT <- FindClusters(NKT, resolution=res, verbose=FALSE) 

# Clustering Tree
clustree(NKT, prefix="integrated_snn_res.") 

# Choose resolution
NKT <- FindClusters(NKT, resolution=0.31, verbose=FALSE) 

7.1.2 Dimensional Reduction

Plot the dimensional reduction for the new cell subclusters.

# Run UMAP
NKT <- RunUMAP(NKT, dims=1:15, verbose=FALSE) 
## Warning: The default method for RunUMAP has changed from calling Python UMAP via reticulate to the R-native UWOT using the cosine metric
## To use Python UMAP via reticulate, set umap.method to 'umap-learn' and metric to 'correlation'
## This message will be shown once per session
# Plot UMAP
DimPlot(NKT, reduction="umap", label=TRUE) + NoLegend() 

7.1.3 Find Subcluster Markers

Find differentially expressed markers between the subclusters.

# Find Markers
NKT.markers <- FindAllMarkers(NKT, only.pos = TRUE, min.pct = 0.50, logfc.threshold = 0.25, max.cells.per.ident=500, verbose=FALSE) 

# List top 10 genes for each subcluster in a table
NKT.markers.top10 <- NKT.markers %>% group_by(cluster) %>% top_n(n = 10, wt = avg_log2FC)
NKT.markers.top10 %>% formattable()
p_val avg_log2FC pct.1 pct.2 p_val_adj cluster gene
2.748997e-125 2.9709229 0.923 0.353 5.497994e-122 0 Ncr1
1.077215e-105 2.4938874 0.888 0.487 2.154429e-102 0 Klrb1c
1.454266e-90 2.6700916 0.917 0.582 2.908531e-87 0 Xcl1
6.652505e-86 1.9931426 0.861 0.490 1.330501e-82 0 Klre1
1.794730e-81 2.5017321 0.821 0.458 3.589461e-78 0 Prf1
1.316373e-79 2.6402843 0.818 0.491 2.632746e-76 0 Irf8
9.487306e-65 4.0725925 0.753 0.443 1.897461e-61 0 Gzma
2.479872e-60 1.8601378 0.827 0.597 4.959744e-57 0 Serpinb9
5.876576e-48 2.0259246 0.586 0.111 1.175315e-44 0 Klra4
1.331234e-33 2.3267827 0.596 0.209 2.662469e-30 0 Klra8
4.525918e-115 2.3637819 0.999 0.851 9.051836e-112 1 Ccl5
6.534789e-115 2.9545144 0.930 0.434 1.306958e-111 1 Cd8b1
1.155064e-110 2.9525152 0.898 0.488 2.310128e-107 1 Gzmk
7.389535e-90 2.4111543 0.859 0.395 1.477907e-86 1 Cd8a
3.696870e-73 1.3234135 0.997 0.877 7.393741e-70 1 Nkg7
6.451329e-47 0.9639693 0.990 0.892 1.290266e-43 1 Ms4a4b
2.430238e-28 0.9617370 0.713 0.565 4.860475e-25 1 Tox
1.172646e-25 0.9214126 0.940 0.808 2.345293e-22 1 Lat
5.685223e-20 1.5205153 0.516 0.364 1.137045e-16 1 Trbv3
7.604307e-20 1.1628097 0.733 0.635 1.520861e-16 1 Ccl4
3.161560e-59 1.3359866 0.798 0.417 6.323120e-56 2 Tnfrsf4
2.374827e-53 1.7842772 0.911 0.710 4.749654e-50 2 Ly6a
9.234563e-39 1.7020911 0.690 0.358 1.846913e-35 2 Cd4
6.175100e-33 1.1042290 0.854 0.701 1.235020e-29 2 Cd28
2.265966e-31 1.2232373 0.636 0.401 4.531933e-28 2 Trbv13-1
1.181441e-26 1.1589386 0.754 0.577 2.362881e-23 2 Cd5
1.123120e-25 1.2470746 0.756 0.489 2.246240e-22 2 Ifi27l2a
4.244419e-22 1.6950581 0.641 0.456 8.488837e-19 2 Cd40lg
1.382343e-07 1.2505646 0.573 0.442 2.764686e-04 2 Izumo1r
3.358300e-06 1.6311205 0.604 0.540 6.716600e-03 2 Ifng
1.395754e-95 1.9608124 0.976 0.554 2.791508e-92 3 Ly6c2
9.780738e-51 1.0215430 0.820 0.469 1.956148e-47 3 Cd160
2.554998e-41 1.0928256 0.784 0.508 5.109997e-38 3 Ms4a4c
1.367099e-33 1.0213058 0.910 0.685 2.734198e-30 3 Cd7
3.034127e-33 0.9331457 0.890 0.624 6.068254e-30 3 Klrk1
1.563870e-30 1.1672667 0.694 0.403 3.127740e-27 3 Klra7
2.850647e-30 1.1078243 0.620 0.373 5.701294e-27 3 Trbv12-2
3.858589e-23 0.8376468 0.792 0.505 7.717179e-20 3 Sell
3.192907e-21 0.8167955 0.747 0.525 6.385814e-18 3 Klrc1
1.239413e-03 1.2384853 0.511 0.424 1.000000e+00 3 Klra1
3.020077e-142 3.3248297 0.962 0.367 6.040154e-139 4 Ckb
1.069686e-126 2.6724631 0.904 0.305 2.139372e-123 4 Actn2
1.306498e-116 3.7645344 0.853 0.258 2.612995e-113 4 Cd163l1
2.760289e-113 2.3212202 0.964 0.614 5.520578e-110 4 Capg
5.406230e-111 2.4175701 0.897 0.289 1.081246e-107 4 Ltb4r1
4.786671e-108 3.1258657 0.845 0.229 9.573342e-105 4 Tcrg-V6
3.999478e-103 4.3816012 0.870 0.429 7.998956e-100 4 Trdv4
1.252668e-92 2.4527070 0.787 0.045 2.505337e-89 4 Blk
1.678410e-89 2.4626528 0.848 0.451 3.356819e-86 4 Serpinb1a
6.080007e-75 2.4147457 0.813 0.417 1.216001e-71 4 Trdc
2.028297e-78 2.7335179 0.829 0.357 4.056595e-75 5 Ccr7
3.094447e-33 1.1718242 0.885 0.570 6.188893e-30 5 Tcf7
3.070180e-26 0.7224290 0.885 0.617 6.140361e-23 5 Acp5
3.716769e-26 1.2282576 0.659 0.376 7.433539e-23 5 Rflnb
5.119457e-17 0.8239742 0.741 0.519 1.023891e-13 5 Smc4
2.997028e-15 1.1835954 0.563 0.313 5.994055e-12 5 Ly6c1
3.203272e-13 1.6520094 0.571 0.276 6.406544e-10 5 Dapl1
1.900742e-12 0.9117834 0.695 0.511 3.801483e-09 5 Gm8369
1.761643e-05 0.6753112 0.522 0.328 3.523287e-02 5 Trbv19
4.668663e-05 1.0448251 0.517 0.344 9.337325e-02 5 Actn1
1.852783e-123 3.2098672 0.976 0.442 3.705566e-120 6 Tnfrsf4
6.663103e-122 2.5449899 0.892 0.192 1.332621e-118 6 Foxp3
2.087155e-100 4.7943474 0.876 0.340 4.174310e-97 6 Areg
3.871402e-100 2.4728109 0.946 0.544 7.742804e-97 6 Ikzf2
4.372832e-93 2.5359951 0.967 0.678 8.745665e-90 6 Ctla4
1.036886e-67 1.8526294 0.965 0.730 2.073771e-64 6 Tnfrsf18
3.742644e-50 1.8410683 0.896 0.640 7.485287e-47 6 Traf1
9.172411e-49 2.4313730 0.742 0.391 1.834482e-45 6 Klrg1
1.375337e-44 1.9062082 0.928 0.759 2.750674e-41 6 Odc1
1.702011e-05 3.2557103 0.508 0.422 3.404022e-02 6 Cxcl2
3.358685e-120 3.0970145 0.979 0.330 6.717369e-117 7 Stmn1
3.394283e-113 3.2093548 0.937 0.268 6.788566e-110 7 Pclaf
1.024165e-112 2.8773806 1.000 0.797 2.048331e-109 7 Hmgb2
2.037985e-107 2.8747000 0.985 0.546 4.075970e-104 7 Tuba1b
2.258869e-105 2.6833454 0.912 0.212 4.517738e-102 7 Birc5
9.059349e-105 3.0666432 0.888 0.182 1.811870e-101 7 Ube2c
9.868234e-70 4.0285541 0.825 0.398 1.973647e-66 7 Hist1h2ap
9.549303e-38 2.9162960 0.692 0.404 1.909861e-34 7 Cd74
1.012726e-15 3.4203059 0.653 0.484 2.025452e-12 7 Igkc
5.106003e-03 3.8423618 0.571 0.512 1.000000e+00 7 Ighm
3.586761e-30 3.3238092 0.984 0.610 7.173521e-27 8 Hilpda
4.585762e-28 3.3670334 0.921 0.370 9.171523e-25 8 Areg
9.451679e-28 2.2881447 0.952 0.481 1.890336e-24 8 Cd40lg
1.366428e-21 1.9706842 0.968 0.721 2.732856e-18 8 Bcl2a1d
7.912297e-17 1.9157922 0.730 0.327 1.582459e-13 8 Csf2
7.135952e-15 1.8927814 0.794 0.387 1.427190e-11 8 Gata3
6.626730e-14 2.1323989 0.810 0.524 1.325346e-10 8 Phlda1
2.079250e-13 1.8180661 0.746 0.380 4.158501e-10 8 Ramp3
4.587388e-13 2.1635938 0.778 0.373 9.174776e-10 8 Tnfsf11
9.906248e-10 1.8546223 0.683 0.463 1.981250e-06 8 Atf3

7.1.4 Rename subcluster identities

Using the information from SingleR and differential expression, assign annotations to the subclusters.

NKT <- RenameIdents(NKT,
                               '0' = "NK Cells",
                               '1' = "Effector CD8+ T Cells",
                               '2' = "Th1 CD4+ T Cells",
                               '3' = "NKT Cells",
                               '4' = "gd T cells",
                               '5' = "Memory CD8+ T Cells",
                               '6' = "Tregs",
                               '7' = "Proliferating CD8+ T Cells",
                               '8' = "Th2 CD4+ T Cells")
NKT$highlevel <- Idents(NKT)

7.1.5 Final Dim Plot

As a final check, take a look at the Dim Plot for the subclusters and confirm that it makes sense.

DimPlot(NKT, reduction="umap", group.by="highlevel", label=TRUE, repel=TRUE) + NoLegend() + labs(title="NK and T Cells")

7.2 Dendritic Cells

7.2.1 PCA and clustering

Run PCA and determine dimensions for clustering.

DCs <- subset(data.integrated, idents = c("Dendritic Cells"))
DCs <- RunPCA(DCs, features=VariableFeatures(DCs),dims=1:30, verbose=FALSE)
ElbowPlot(DCs, ndims=30, reduction="pca")

Generate a clustering tree and determine clustering resolution for subclusters.

DCs <- FindNeighbors(DCs, dims=1:15, verbose=FALSE) 
DCs <- FindClusters(DCs, resolution=res, verbose=FALSE) 

# Clustering Tree
clustree(DCs, prefix="integrated_snn_res.") 

# Choose resolution
DCs <- FindClusters(DCs, resolution=0.3, verbose=FALSE) 

7.2.2 Dimensional Reduction

Plot the dimensional reduction for the new cell subclusters.

# Run UMAP
DCs <- RunUMAP(DCs, dims=1:15, verbose=FALSE) 

# Plot UMAP
DimPlot(DCs, reduction="umap", label=TRUE) + NoLegend() 

7.2.3 Find Subcluster Markers

Find differentially expressed markers between the subclusters.

# Find Markers
DCs.markers <- FindAllMarkers(DCs, only.pos = TRUE, min.pct = 0.50, logfc.threshold = 0.25, max.cells.per.ident=500, verbose=FALSE) 

# List top 10 genes for each subcluster in a table
DCs.markers.top10 <- DCs.markers %>% group_by(cluster) %>% top_n(n = 10, wt = avg_log2FC)
DCs.markers.top10 %>% formattable()
p_val avg_log2FC pct.1 pct.2 p_val_adj cluster gene
4.132616e-103 1.7822305 0.978 0.587 8.265232e-100 0 Ppt1
1.288811e-100 1.6192123 0.991 0.694 2.577622e-97 0 Naaa
8.984430e-96 1.4858333 0.977 0.706 1.796886e-92 0 Clec9a
3.122629e-89 1.5955288 0.993 0.699 6.245259e-86 0 Irf8
4.584179e-89 1.4155084 1.000 1.000 9.168358e-86 0 Cst3
2.718695e-87 1.2981060 0.984 0.648 5.437390e-84 0 Wdfy4
4.434739e-84 1.2162618 0.958 0.646 8.869479e-81 0 Ifi205
1.350882e-83 1.5317654 0.936 0.628 2.701763e-80 0 Rtl8c
6.323133e-82 1.2073355 0.910 0.472 1.264627e-78 0 Xcr1
3.630771e-78 1.3222912 0.973 0.726 7.261543e-75 0 Cd24a
1.551424e-115 2.2099266 0.999 0.725 3.102847e-112 1 Ifitm3
4.267340e-110 2.8759749 0.912 0.366 8.534680e-107 1 Cd209a
4.085703e-105 1.8779507 0.956 0.672 8.171405e-102 1 Ms4a6c
3.146464e-80 2.3980574 0.884 0.496 6.292927e-77 1 Tnfsf9
2.508910e-79 1.6602456 0.945 0.566 5.017820e-76 1 Wfdc17
1.230073e-70 2.0457941 0.844 0.545 2.460145e-67 1 Clec10a
2.385322e-54 1.6405794 0.888 0.687 4.770644e-51 1 Jun
8.504735e-50 1.7749085 0.872 0.647 1.700947e-46 1 Dusp2
8.630388e-43 2.0651958 0.941 0.805 1.726078e-39 1 Il1b
8.615181e-32 1.8232759 0.690 0.502 1.723036e-28 1 Mgl2
7.970474e-158 3.4318171 1.000 0.768 1.594095e-154 2 Ccl6
8.693920e-157 4.1244493 0.988 0.289 1.738784e-153 2 Fn1
5.098872e-151 3.5007258 1.000 0.822 1.019774e-147 2 Lyz2
1.143521e-142 3.1419724 0.979 0.368 2.287043e-139 2 Ear2
5.514963e-136 5.6676315 0.926 0.121 1.102993e-132 2 Retnla
7.994833e-134 2.7005853 0.977 0.588 1.598967e-130 2 Capg
2.472123e-126 2.8446938 0.971 0.389 4.944246e-123 2 Pltp
7.898594e-125 3.3354109 0.946 0.373 1.579719e-121 2 Lpl
2.958143e-72 3.3111296 0.817 0.291 5.916285e-69 2 Cxcl2
1.359873e-67 3.6503698 0.815 0.424 2.719747e-64 2 Lyz1
3.948359e-162 5.2373921 0.995 0.460 7.896718e-159 3 Ccr7
2.946617e-148 3.7548861 0.976 0.585 5.893233e-145 3 Tmem123
7.520760e-135 3.0238858 0.957 0.480 1.504152e-131 3 Il4i1
2.787873e-132 4.8110206 0.928 0.354 5.575747e-129 3 Fscn1
4.378408e-125 3.2792394 0.940 0.606 8.756817e-122 3 Tspan3
1.595677e-122 5.5758262 0.938 0.618 3.191354e-119 3 Ccl5
4.645795e-122 2.9180677 0.945 0.459 9.291589e-119 3 Cd63
1.048176e-92 4.3353679 0.877 0.566 2.096353e-89 3 Fabp5
2.836357e-69 3.4714497 0.907 0.729 5.672714e-66 3 Epsti1
9.301167e-67 2.9531972 0.734 0.304 1.860233e-63 3 Apol7c
4.215201e-84 1.3333202 0.986 0.531 8.430402e-81 4 Mcm3
4.637263e-74 1.3543520 0.952 0.591 9.274525e-71 4 Dut
3.991403e-64 1.0912306 0.983 0.710 7.982806e-61 4 Dctpp1
1.333539e-63 1.0734905 0.910 0.503 2.667079e-60 4 Mcm6
7.765374e-60 1.1151922 1.000 0.776 1.553075e-56 4 Naaa
7.118605e-52 0.9959128 0.990 0.740 1.423721e-48 4 Wdfy4
1.057127e-51 0.9945405 0.990 0.730 2.114254e-48 4 Ifi205
8.083829e-51 1.0382285 1.000 0.694 1.616766e-47 4 Ppt1
1.619567e-50 1.0234930 0.844 0.496 3.239134e-47 4 Mcm7
2.692590e-48 1.0551616 0.955 0.712 5.385180e-45 4 Rtl8c
1.637050e-83 3.8443871 1.000 0.427 3.274099e-80 5 Hist1h2ap
5.593832e-81 2.1982939 1.000 0.487 1.118766e-77 5 Mki67
7.983693e-81 2.2504485 1.000 0.502 1.596739e-77 5 Top2a
2.479643e-80 2.8330287 1.000 0.425 4.959287e-77 5 Pclaf
3.926379e-80 2.6655453 1.000 0.541 7.852759e-77 5 Stmn1
3.205970e-79 2.6808852 1.000 0.856 6.411940e-76 5 Tuba1b
3.112969e-78 2.7946093 0.994 0.583 6.225938e-75 5 H2afx
7.947979e-77 2.3793323 1.000 0.887 1.589596e-73 5 Tubb5
1.156843e-76 2.9837176 1.000 0.788 2.313685e-73 5 Hmgb2
6.444020e-72 2.1294641 0.976 0.379 1.288804e-68 5 Birc5
7.470152e-64 3.3448049 0.992 0.576 1.494030e-60 6 Cd7
2.973018e-55 2.4455539 1.000 0.838 5.946036e-52 6 Bst2
3.052503e-49 1.8254593 0.827 0.231 6.105005e-46 6 AC140186.1
3.463277e-48 2.6109826 0.874 0.300 6.926553e-45 6 Cd209d
4.257668e-45 2.5952625 0.961 0.593 8.515335e-42 6 Plac8
3.072744e-42 1.5683703 0.992 0.721 6.145488e-39 6 Ms4a6c
3.899641e-41 1.6613940 0.961 0.620 7.799282e-38 6 Klrd1
1.884725e-31 2.6749087 0.858 0.684 3.769450e-28 6 Ly6c2
2.692111e-30 2.4214489 0.850 0.478 5.384222e-27 6 Ccl4
7.846905e-28 1.7158160 0.992 0.889 1.569381e-24 6 Lgals1
3.273773e-51 1.4380065 0.966 0.560 6.547547e-48 7 Mcm5
5.499006e-50 1.5653553 0.966 0.506 1.099801e-46 7 Mcm7
6.510480e-47 1.3015235 0.933 0.437 1.302096e-43 7 Lig1
1.939022e-46 1.2778154 0.966 0.549 3.878043e-43 7 Mcm3
6.904579e-46 1.6625803 0.966 0.547 1.380916e-42 7 Stmn1
4.012971e-43 1.2662324 0.941 0.517 8.025943e-40 7 Mcm6
1.519881e-40 1.4844533 0.941 0.605 3.039762e-37 7 Dut
2.166224e-40 1.3758651 1.000 0.857 4.332448e-37 7 Tuba1b
2.278872e-39 1.3531315 0.958 0.640 4.557745e-36 7 Pcna
9.508446e-33 1.7763192 0.840 0.435 1.901689e-29 7 Pclaf
1.382035e-55 2.3465094 1.000 0.419 2.764069e-52 8 Ccnb2
8.491016e-53 2.4666880 0.990 0.534 1.698203e-49 8 Cenpa
5.599518e-52 3.0223069 1.000 0.791 1.119904e-48 8 Hmgb2
1.593957e-47 2.2107799 0.951 0.388 3.187915e-44 8 Birc5
1.605085e-46 3.1480437 0.971 0.597 3.210171e-43 8 Ube2c
1.598591e-43 1.7866309 0.893 0.259 3.197182e-40 8 Cenpf
4.182377e-43 1.8430711 0.961 0.621 8.364754e-40 8 Cks1b
3.046095e-38 2.1950545 1.000 0.589 6.092189e-35 8 H2afx
1.613704e-28 1.7158834 0.816 0.307 3.227408e-25 8 Cdc20
1.563958e-27 2.1799156 0.874 0.588 3.127915e-24 8 Tubb4b
7.887223e-34 1.6318980 0.959 0.641 1.577445e-30 9 Serpinb9
9.395487e-34 3.4733534 0.918 0.707 1.879097e-30 9 Cxcl9
1.495563e-31 1.6610002 1.000 0.895 2.991127e-28 9 Bcl2a1b
1.999405e-30 2.2401177 0.907 0.672 3.998811e-27 9 Serpina3g
1.161478e-29 1.6731534 0.959 0.751 2.322957e-26 9 Serpinb6b
5.098181e-29 1.5474188 0.979 0.730 1.019636e-25 9 Basp1
2.961256e-27 1.9261900 0.918 0.671 5.922513e-24 9 Ccl22
7.926767e-24 2.1088656 0.804 0.444 1.585353e-20 9 Il12b
8.683215e-22 1.4959547 0.938 0.764 1.736643e-18 9 Bcl2a1a
1.758240e-20 2.5229281 0.784 0.577 3.516480e-17 9 Ccl17
2.030730e-92 3.9472307 0.875 0.009 4.061461e-89 10 Klk1
7.578645e-46 4.4072128 0.931 0.235 1.515729e-42 10 Cox6a2
1.798224e-45 4.9628856 1.000 0.276 3.596447e-42 10 Siglech
3.845719e-43 4.3795367 1.000 0.522 7.691438e-40 10 Ctsl
7.793009e-43 4.3316323 1.000 0.840 1.558602e-39 10 Bst2
2.970653e-42 3.8471759 1.000 0.597 5.941305e-39 10 Plac8
2.241284e-41 4.8195969 0.972 0.346 4.482569e-38 10 Ly6d
1.046250e-35 4.1397075 0.958 0.685 2.092499e-32 10 Ly6c2
5.011100e-30 3.3197685 0.903 0.605 1.002220e-26 10 Ly6a
1.439982e-22 3.3731883 0.861 0.482 2.879964e-19 10 Ccl4

7.2.4 Rename subcluster identities

Using the information from SingleR and differential expression, assign annotations to the subclusters.

DCs <- RenameIdents(DCs,
                               '0' = "cDC1",
                               '1' = "cDC2",
                               '2' = "moDCs",
                               '3' = "Activated cDC2",
                               '4' = "cDC1",
                               '5' = "Proliferating cDC1",
                               '6' = "cDC2",
                               '7' = "Proliferating cDC2",
                               '8' = "Proliferating cDC1",
                               '9' = "Activated cDC1",
                               '10' = "pDCs")
DCs$highlevel <- Idents(DCs)

7.2.5 Final Dim Plot

As a final check, take a look at the Dim Plot for the subclusters and confirm that it makes sense.

DimPlot(DCs, reduction="umap", group.by="highlevel", label=TRUE, repel=TRUE) + NoLegend() + labs(title="Dendritic Cells")

7.3 B and Plasma Cells

7.3.1 PCA and clustering

Run PCA and determine dimensions for clustering.

Bcells <- subset(data.integrated, idents = c("B Cells","Plasma Cells"))
Bcells <- RunPCA(Bcells, features=VariableFeatures(Bcells),dims=1:30, verbose=FALSE)
ElbowPlot(Bcells, ndims=30, reduction="pca")

Generate a clustering tree and determine clustering resolution for subclusters.

Bcells <- FindNeighbors(Bcells, dims=1:10, verbose=FALSE) 
Bcells <- FindClusters(Bcells, resolution=res, verbose=FALSE) 

# Clustering Tree
clustree(Bcells, prefix="integrated_snn_res.") 

# Choose resolution
Bcells <- FindClusters(Bcells, resolution=0.2, verbose=FALSE) 

7.3.2 Dimensional Reduction

Plot the dimensional reduction for the new cell subclusters.

# Run UMAP
Bcells <- RunUMAP(Bcells, dims=1:10, verbose=FALSE) 

# Plot UMAP
DimPlot(Bcells, reduction="umap", label=TRUE) + NoLegend() 

7.3.3 Find Subcluster Markers

Find differentially expressed markers between the subclusters.

# Find Markers
Bcells.markers <- FindAllMarkers(Bcells, only.pos = TRUE, min.pct = 0.50, logfc.threshold = 0.25, max.cells.per.ident=500, verbose=FALSE) 

# List top 10 genes for each subcluster in a table
Bcells.markers.top10 <- Bcells.markers %>% group_by(cluster) %>% top_n(n = 10, wt = avg_log2FC)
Bcells.markers.top10 %>% formattable()
p_val avg_log2FC pct.1 pct.2 p_val_adj cluster gene
3.949165e-80 1.1561122 0.999 0.946 7.898330e-77 0 Tmsb10
6.183227e-74 2.0433307 0.753 0.306 1.236645e-70 0 Vpreb3
6.622462e-71 1.9816668 0.826 0.508 1.324492e-67 0 Ighd
1.429516e-66 1.4297265 0.768 0.392 2.859032e-63 0 Gpr171
2.371830e-66 1.9354872 0.894 0.703 4.743659e-63 0 Ccr7
1.823172e-57 2.1273507 0.721 0.289 3.646345e-54 0 Fcer2a
3.163160e-31 1.3518411 0.740 0.509 6.326319e-28 0 Cd55
8.751690e-24 1.4305531 0.779 0.625 1.750338e-20 0 Chchd10
1.775079e-16 1.2792264 0.620 0.483 3.550158e-13 0 Sell
3.983837e-06 1.2062081 0.557 0.520 7.967674e-03 0 Dusp2
5.947565e-59 1.3707802 0.941 0.799 1.189513e-55 1 Plac8
2.254139e-51 0.9215450 0.791 0.433 4.508278e-48 1 Anxa2
8.235375e-42 0.8690719 0.974 0.815 1.647075e-38 1 Lsp1
2.118618e-38 0.8855524 0.979 0.837 4.237235e-35 1 Napsa
1.492348e-34 0.9739292 0.788 0.503 2.984697e-31 1 Gm15987
1.984907e-34 1.1148646 0.953 0.849 3.969815e-31 1 Vim
7.395422e-34 0.9860197 0.732 0.483 1.479084e-30 1 Atf3
1.480452e-29 1.0547492 0.724 0.423 2.960904e-26 1 Apoe
8.601954e-26 0.9159644 0.787 0.663 1.720391e-22 1 Ass1
1.989738e-13 0.9222719 0.762 0.625 3.979476e-10 1 Lgals1
1.436064e-85 4.9620643 0.933 0.398 2.872128e-82 2 Iglv1
1.176776e-51 5.5486325 0.714 0.454 2.353552e-48 2 Igkv3-5
9.566655e-43 7.6744493 0.724 0.516 1.913331e-39 2 Igkv3-2
2.825904e-38 4.5668984 0.670 0.535 5.651809e-35 2 Ighv1-72
2.170926e-36 6.1160248 0.635 0.517 4.341852e-33 2 Igkv3-10
8.081530e-31 4.8733926 0.517 0.352 1.616306e-27 2 Igkv6-20
6.562030e-29 4.8319326 0.621 0.377 1.312406e-25 2 Ighv1-53
5.052467e-14 4.7273671 0.510 0.468 1.010493e-10 2 Ighv1-76
2.706340e-11 5.1128937 0.562 0.520 5.412679e-08 2 Igkv17-121
7.552254e-04 4.9372836 0.500 0.504 1.000000e+00 2 Igkv2-109
1.004848e-39 1.3533445 0.993 0.823 2.009696e-36 3 Plac8
3.141348e-33 1.3248578 0.899 0.494 6.282696e-30 3 Anxa2
3.293647e-30 1.1701665 0.919 0.549 6.587293e-27 3 Gm15987
4.711578e-28 1.1724651 0.993 0.866 9.423156e-25 3 Vim
8.857638e-26 1.0974721 0.745 0.403 1.771528e-22 3 Ms4a6c
1.587372e-25 1.2842586 0.940 0.642 3.174744e-22 3 Lgals1
6.955958e-22 1.1724903 0.839 0.683 1.391192e-18 3 Ass1
1.188083e-13 1.0792004 0.799 0.651 2.376166e-10 3 Ccnd2
1.647707e-12 1.3080328 0.705 0.531 3.295414e-09 3 Atf3
1.982996e-11 1.8630146 0.570 0.334 3.965991e-08 3 Zcwpw1
6.971532e-64 6.0601780 1.000 0.494 1.394306e-60 4 Igkv14-126
6.371823e-61 5.5969025 0.991 0.463 1.274365e-57 4 Ighv11-2
7.563416e-48 2.4402445 1.000 0.933 1.512683e-44 4 Igkc
4.946725e-46 1.9025852 1.000 0.578 9.893450e-43 4 Txndc5
2.804988e-45 2.4306529 1.000 0.368 5.609977e-42 4 Jchain
1.115959e-44 2.1865127 1.000 0.974 2.231917e-41 4 Ighm
3.689103e-42 1.8786355 1.000 0.605 7.378205e-39 4 Xbp1
4.451806e-19 1.9706956 0.810 0.588 8.903613e-16 4 Ly6c2
2.147498e-12 2.4063639 0.560 0.279 4.294996e-09 4 Igkv9-120
3.779843e-09 2.0774284 0.603 0.277 7.559685e-06 4 Ighv1-18
5.272394e-59 4.1555659 0.974 0.373 1.054479e-55 5 Lyz2
1.164038e-56 3.2240030 1.000 0.463 2.328076e-53 5 Ighv11-2
2.042494e-55 2.9338020 1.000 0.494 4.084988e-52 5 Igkv14-126
3.473896e-43 3.6937626 0.905 0.337 6.947791e-40 5 C1qb
6.361116e-35 3.5908240 0.862 0.374 1.272223e-31 5 Ccl6
6.476830e-35 3.2286400 0.862 0.446 1.295366e-31 5 C1qc
8.603512e-34 2.8780690 0.897 0.530 1.720702e-30 5 Ctsb
1.463475e-11 3.5243671 0.750 0.352 2.926949e-08 5 Ighv1-39
1.663488e-11 4.1415211 0.681 0.324 3.326976e-08 5 Igkv6-17
1.169156e-03 3.2680513 0.500 0.019 1.000000e+00 5 Tpsb2
4.390046e-22 2.3469886 1.000 0.186 8.780092e-19 6 Mki67
6.209868e-21 2.8322883 1.000 0.331 1.241974e-17 6 Ube2c
6.145631e-19 2.2047067 1.000 0.462 1.229126e-15 6 Top2a
1.968486e-17 2.8917450 1.000 0.436 3.936971e-14 6 Hist1h2ap
1.179147e-14 2.0782717 1.000 0.788 2.358293e-11 6 Hsp90b1
1.204119e-14 2.6744748 1.000 0.728 2.408239e-11 6 Sec11c
1.464429e-14 2.5461524 1.000 0.618 2.928858e-11 6 Hmgb2
4.376904e-14 2.5547172 1.000 0.551 8.753808e-11 6 Edem1
5.410043e-13 2.0988436 0.963 0.478 1.082009e-09 6 Ighv11-2
5.985159e-13 2.3634500 1.000 0.715 1.197032e-09 6 Rexo2

7.3.4 Rename subcluster identities

Using the information from SingleR and differential expression, assign annotations to the subclusters.

Bcells <- RenameIdents(Bcells,
                               '0' = "Naive B Cells",
                               '1' = "Memory B Cells",
                               '2' = "Plasma Cells",
                               '3' = "B-1a Cells",
                               '4' = "Plasma Cells",
                               '5' = "Plasma Cells",
                               '6' = "Plasma Cells")
Bcells$highlevel <- Idents(Bcells)

7.3.5 Final Dim Plot

As a final check, take a look at the Dim Plot for the subclusters and confirm that it makes sense.

DimPlot(Bcells, reduction="umap", group.by="highlevel", label=TRUE, repel=TRUE) + NoLegend() + labs(title="B and Plasma Cells")

7.4 ILCs

7.4.1 PCA and clustering

Run PCA and determine dimensions for clustering.

ILCs <- subset(data.integrated, idents = c("ILCs"))
ILCs <- RunPCA(ILCs, features=VariableFeatures(ILCs),dims=1:30, verbose=FALSE)
ElbowPlot(ILCs, ndims=30, reduction="pca")

Generate a clustering tree and determine clustering resolution for subclusters.

ILCs <- FindNeighbors(ILCs, dims=1:10, verbose=FALSE) 
ILCs <- FindClusters(ILCs, resolution=res, verbose=FALSE) 

# Clustering Tree
clustree(ILCs, prefix="integrated_snn_res.") 

# Choose resolution
ILCs <- FindClusters(ILCs, resolution=0.3, verbose=FALSE) 

7.4.2 Dimensional Reduction

Plot the dimensional reduction for the new cell subclusters.

# Run UMAP
ILCs <- RunUMAP(ILCs, dims=1:10, verbose=FALSE) 

# Plot UMAP
DimPlot(ILCs, reduction="umap", label=TRUE) + NoLegend() 

7.4.3 Find Subcluster Markers

Find differentially expressed markers between the subclusters.

# Find Markers
ILCs.markers <- FindAllMarkers(ILCs, only.pos = TRUE, min.pct = 0.50, logfc.threshold = 0.25, max.cells.per.ident=500, verbose=FALSE) 

# List top 10 genes for each subcluster in a table
ILCs.markers.top10 <- ILCs.markers %>% group_by(cluster) %>% top_n(n = 10, wt = avg_log2FC)
ILCs.markers.top10 %>% formattable()
p_val avg_log2FC pct.1 pct.2 p_val_adj cluster gene
2.620692e-45 2.0358511 0.908 0.419 5.241384e-42 0 Arg1
3.017984e-24 0.9674017 0.887 0.581 6.035968e-21 0 Ccl1
3.698125e-15 0.9871205 0.878 0.660 7.396250e-12 0 Ccdc184
4.133215e-09 1.0557260 0.617 0.281 8.266429e-06 0 Klf5
5.696874e-09 0.6545184 0.908 0.660 1.139375e-05 0 Bhlhe40
2.519064e-06 0.6293090 0.753 0.567 5.038128e-03 0 Nr4a3
5.571794e-06 1.5186129 0.508 0.197 1.114359e-02 0 Lif
9.104491e-06 0.8500509 0.711 0.483 1.820898e-02 0 Lgals1
2.672643e-05 0.7781173 0.520 0.227 5.345286e-02 0 Cxcl2
1.314290e-04 1.0370546 0.617 0.369 2.628580e-01 0 Csf2
9.233518e-16 0.8754994 0.576 0.824 1.846704e-12 1 Cd74
4.967232e-12 0.4584160 0.502 0.386 9.934465e-09 1 Cd2
4.525421e-09 0.3140712 0.448 0.709 9.050843e-06 1 Gem
4.996302e-06 1.0643079 0.611 0.605 9.992604e-03 1 Calca
5.700769e-06 0.7133663 0.635 0.547 1.140154e-02 1 Jun
2.365746e-05 0.2986080 0.532 0.448 4.731493e-02 1 Mxd1
4.904540e-05 0.3072121 0.764 0.921 9.809081e-02 1 Pdcd1
6.993128e-05 0.5535503 0.803 0.767 1.398626e-01 1 Lmo4
8.141470e-04 0.3289924 0.685 0.612 1.000000e+00 1 Cdkn1a
4.402136e-03 0.8227895 0.872 0.908 1.000000e+00 1 Lars2

7.4.4 Rename subcluster identities

Using the information from SingleR and differential expression, assign annotations to the subclusters.

There are not many significant differences between the two ILC clusters. Instead, we define these cells based on expression of their transcription factor expression for Gata3 and Klrg1.

RidgePlot(ILCs, features=c("Gata3","Eomes","Rorc"), assay="RNA")
## Picking joint bandwidth of 0.252
## Picking joint bandwidth of 2.73e-06
## Picking joint bandwidth of 2.83e-06

ILCs <- RenameIdents(ILCs,
                               '0' = "ILC2s",
                               '1' = "ILC2s")
ILCs$highlevel <- Idents(ILCs)

7.4.5 Final Dim Plot

As a final check, take a look at the Dim Plot for the subclusters and confirm that it makes sense.

DimPlot(ILCs, reduction="umap", group.by="highlevel", label=TRUE) + NoLegend() + labs(title="Type 2 Innate Lymphocytes")

7.5 Macrophages and Monocytes

7.5.1 PCA and clustering

Run PCA and determine dimensions for clustering.

MM <- subset(data.integrated, idents = c("Macrophages","Monocytes"))
MM <- RunPCA(MM, features=VariableFeatures(MM),dims=1:30, verbose=FALSE)
ElbowPlot(MM, ndims=30, reduction="pca")

Generate a clustering tree and determine clustering resolution for subclusters.

MM <- FindNeighbors(MM, dims=1:25, verbose=FALSE) 
MM <- FindClusters(MM, resolution=res, verbose=FALSE) 

# Clustering Tree
clustree(MM, prefix="integrated_snn_res.") 

# Choose resolution
MM <- FindClusters(MM, resolution=0.25, verbose=FALSE) 

7.5.2 Dimensional Reduction

Plot the dimensional reduction for the new cell subclusters.

# Run UMAP
MM <- RunUMAP(MM, dims=1:25, verbose=FALSE) 

# Plot UMAP
DimPlot(MM, reduction="umap", label=TRUE) + NoLegend() 

7.5.3 Find Subcluster Markers

Find differentially expressed markers between the subclusters.

# Find Markers
MM.markers <- FindAllMarkers(MM, only.pos = TRUE, min.pct = 0.50, logfc.threshold = 0.25, max.cells.per.ident=500, verbose=FALSE) 

# List top 10 genes for each subcluster in a table
MM.markers.top10 <- MM.markers %>% group_by(cluster) %>% top_n(n = 10, wt = avg_log2FC)
MM.markers.top10 %>% formattable()
p_val avg_log2FC pct.1 pct.2 p_val_adj cluster gene
2.631330e-59 0.9630178 0.999 0.871 5.262660e-56 0 C1qa
5.405971e-56 1.2225347 0.994 0.945 1.081194e-52 0 H2-Eb1
2.336610e-55 1.0472063 0.995 0.976 4.673221e-52 0 Cd74
3.291554e-53 1.0877416 0.993 0.960 6.583108e-50 0 H2-Aa
1.892834e-45 1.0362121 0.995 0.953 3.785667e-42 0 H2-Ab1
7.552646e-45 1.0880559 0.998 0.908 1.510529e-41 0 Selenop
9.242817e-44 1.2204447 0.975 0.870 1.848563e-40 0 Atf3
8.105603e-43 1.2342173 0.998 0.947 1.621121e-39 0 Apoe
2.710674e-36 1.0096261 0.998 0.883 5.421348e-33 0 Pltp
5.138933e-16 1.0212269 0.935 0.822 1.027787e-12 0 Cxcl2
3.912048e-128 2.9259917 0.980 0.669 7.824097e-125 1 Cbr2
5.188782e-117 2.1862167 0.985 0.821 1.037756e-113 1 F13a1
1.425706e-108 2.0701413 0.976 0.617 2.851412e-105 1 Folr2
1.603882e-98 2.0916825 0.943 0.580 3.207764e-95 1 Clec10a
2.015713e-89 2.2432107 0.986 0.883 4.031425e-86 1 Wfdc17
2.971004e-81 2.1934916 0.878 0.530 5.942009e-78 1 Ltc4s
2.949074e-76 2.8381634 0.925 0.671 5.898148e-73 1 Ccl8
9.556757e-75 2.4802099 0.814 0.330 1.911351e-71 1 Fcna
1.671907e-57 2.9411159 0.749 0.403 3.343813e-54 1 Cd209f
2.459026e-28 1.9921938 0.748 0.611 4.918051e-25 1 Ccl7
3.197622e-71 1.5063018 0.900 0.634 6.395245e-68 2 Ccr2
4.004037e-47 1.0832322 0.871 0.647 8.008074e-44 2 Plbd1
1.505326e-39 0.9846393 0.977 0.931 3.010651e-36 2 Prdx5
3.971703e-28 0.9824928 0.734 0.601 7.943406e-25 2 Fn1
4.133601e-21 1.3153343 0.778 0.698 8.267202e-18 2 Il1b
1.264890e-18 0.9713420 0.663 0.601 2.529780e-15 2 Clec4e
1.935968e-12 1.2718399 0.784 0.750 3.871937e-09 2 Gbp2
1.690806e-09 1.2614341 0.810 0.757 3.381612e-06 2 AW112010
4.086157e-09 2.1010263 0.711 0.668 8.172314e-06 2 Cxcl9
1.396295e-04 1.4252503 0.719 0.800 2.792589e-01 2 Lyz1
1.881292e-59 1.8392096 0.769 0.214 3.762584e-56 3 Igkv3-2
2.723483e-58 2.3264047 0.751 0.275 5.446965e-55 3 Igkv8-30
2.345015e-26 1.0018333 0.941 0.842 4.690030e-23 3 AY036118
8.629598e-21 0.9260990 0.359 0.551 1.725920e-17 3 Nkg7
5.544459e-18 0.8619105 0.917 0.799 1.108892e-14 3 Fabp5
3.827955e-16 0.6961475 0.310 0.548 7.655909e-13 3 Ms4a4b
9.640344e-14 1.3104787 0.653 0.457 1.928069e-10 3 Igkc
4.467950e-10 1.2432473 0.350 0.527 8.935901e-07 3 Gm26917
4.044064e-08 1.4551220 0.599 0.548 8.088129e-05 3 Ccl5
2.645991e-03 0.7386485 0.586 0.473 1.000000e+00 3 S100a8
8.943266e-128 2.0461277 0.999 0.966 1.788653e-124 4 Lgals3
5.623202e-124 1.9431574 0.997 0.807 1.124640e-120 4 Trem2
2.214393e-115 2.0833538 0.997 0.844 4.428787e-112 4 Lpl
3.119676e-100 1.9554032 0.996 0.866 6.239353e-97 4 Anxa1
2.480309e-97 2.3117984 0.990 0.762 4.960618e-94 4 Fabp4
2.405157e-91 2.7196455 0.875 0.477 4.810315e-88 4 Gpnmb
5.494707e-79 1.9254009 0.977 0.797 1.098941e-75 4 Fabp5
1.103309e-76 2.0423153 0.889 0.677 2.206618e-73 4 Gdf15
1.176345e-50 2.2848724 0.786 0.524 2.352689e-47 4 Ctsk
1.445354e-39 2.3279880 0.710 0.426 2.890707e-36 4 Mmp12
1.265259e-140 2.2905049 0.999 0.925 2.530519e-137 5 Ifitm3
7.766202e-131 3.7099677 0.963 0.515 1.553240e-127 5 Plac8
2.540404e-123 2.8412764 0.971 0.666 5.080809e-120 5 Msrb1
5.011354e-117 2.4234060 0.957 0.593 1.002271e-113 5 Napsa
1.139149e-112 2.5857945 0.919 0.472 2.278298e-109 5 Ifitm6
2.598845e-92 2.8852511 0.933 0.659 5.197689e-89 5 Gngt2
4.545875e-87 2.5410988 0.829 0.395 9.091749e-84 5 Pglyrp1
5.287463e-81 2.1862834 0.888 0.532 1.057493e-77 5 Ly6c2
1.390062e-57 2.2275482 0.771 0.416 2.780124e-54 5 Hp
3.454843e-24 2.2385940 0.651 0.469 6.909686e-21 5 Ace
8.221647e-128 2.5131059 0.956 0.366 1.644329e-124 6 Birc5
3.132663e-123 2.7377913 0.983 0.671 6.265326e-120 6 Hmgb2
2.014025e-117 2.7327736 0.957 0.553 4.028050e-114 6 Stmn1
3.899257e-112 2.3520757 0.916 0.409 7.798513e-109 6 Pclaf
5.678232e-107 2.0549068 0.918 0.426 1.135646e-103 6 Top2a
2.176883e-95 2.5938169 0.890 0.354 4.353765e-92 6 Ube2c
1.215893e-93 1.9752420 0.989 0.897 2.431786e-90 6 Tuba1b
5.895895e-93 2.1435189 0.959 0.820 1.179179e-89 6 Tubb5
1.282352e-86 3.2727285 0.881 0.457 2.564703e-83 6 Hist1h2ap
1.437759e-46 3.3273914 0.643 0.258 2.875517e-43 6 Igkv3-2
3.443472e-163 5.8665551 0.994 0.208 6.886943e-160 7 Cma1
2.221318e-161 5.7783622 0.990 0.109 4.442637e-158 7 Mcpt4
6.391612e-158 5.0252377 0.990 0.255 1.278322e-154 7 Tpsb2
9.625010e-139 2.9099955 0.949 0.035 1.925002e-135 7 Fcer1a
1.125348e-99 4.2606542 0.884 0.139 2.250695e-96 7 Cpa3
2.845058e-47 2.4843559 0.767 0.314 5.690116e-44 7 Hdc
4.650186e-45 2.1661052 0.798 0.671 9.300372e-42 7 Ly6a
4.069314e-43 3.2920822 0.751 0.258 8.138627e-40 7 Jchain
6.210268e-41 2.4265468 0.556 0.024 1.242054e-37 7 Igkv4-55
1.419138e-04 1.7986944 0.559 0.489 2.838275e-01 7 Serpinb1a
1.203595e-95 1.5025784 0.955 0.509 2.407190e-92 8 Gmnn
1.185826e-93 1.3830646 0.930 0.535 2.371651e-90 8 Mcm5
1.104388e-86 1.3561979 0.921 0.463 2.208775e-83 8 Mcm6
2.920161e-84 1.3330546 0.921 0.558 5.840321e-81 8 Mcm3
1.381643e-77 1.3210080 0.896 0.405 2.763287e-74 8 Tk1
9.176511e-77 1.2694331 0.966 0.570 1.835302e-73 8 Stmn1
4.034621e-76 1.2219774 0.902 0.518 8.069242e-73 8 Dut
9.377592e-66 1.1831015 0.963 0.684 1.875518e-62 8 Pcna
4.461093e-55 1.2638084 0.933 0.548 8.922186e-52 8 Ccnd1
2.262889e-23 1.2141170 0.803 0.576 4.525778e-20 8 Mgl2
3.625103e-86 4.3272386 0.995 0.617 7.250205e-83 9 Fn1
2.163070e-80 6.1117747 0.917 0.275 4.326139e-77 9 Saa3
5.878531e-79 2.8548741 0.917 0.248 1.175706e-75 9 Serpinb2
8.436965e-64 2.8345475 0.966 0.785 1.687393e-60 9 Lyz1
1.317519e-52 6.4322393 0.848 0.463 2.635038e-49 9 Slpi
1.504247e-48 3.9496981 0.809 0.308 3.008494e-45 9 Prg4
1.810113e-41 3.6897817 0.775 0.261 3.620225e-38 9 Cxcl13
1.878731e-30 3.6166782 0.681 0.026 3.757462e-27 9 Alox15
4.505472e-28 2.7179290 0.740 0.376 9.010943e-25 9 Cd5l
1.720072e-09 3.0364026 0.711 0.419 3.440143e-06 9 Igkv14-126

7.5.4 Rename subcluster identities

Using the information from SingleR and differential expression, assign annotations to the subclusters.

Macrophages were by far the largest population and some other cell types seem to have snuck in by Louvain clustering. The high levels of local lipid may induce transcriptional profiles that cause different cell types to converge. For example, population 6 are Mast Cells unique to WC expressing high levels of Mcpt4 and Cma1, but they can be found near the lipid-associated population.

MM <- RenameIdents(MM,
                               '0' = "Tissue Resident Macrophages",
                               '1' = "Tissue Resident Macrophages",
                               '2' = "Classical Monocytes",
                               '3' = "LAMs",
                               '4' = "LAMs",
                               '5' = "Non-classical Monocytes",
                               '6' = "Proliferating Macrophages",
                               '7' = "Mast Cells",
                               '8' = "LAMs",
                               '9' = "Apoptotic Cell Clearance Macrophages")
MM$highlevel <- Idents(MM)

7.5.5 Final Dim Plot

As a final check, take a look at the Dim Plot for the subclusters and confirm that it makes sense.

DimPlot(MM, reduction="umap", group.by="highlevel", label=TRUE, repel=TRUE) + NoLegend() + labs(title="Monocytes and Macrophages")

7.6 Stromal Cells

7.6.1 PCA and clustering

Run PCA and determine dimensions for clustering.

Stromal <- subset(data.integrated, idents = c("Stromal Cells"))
Stromal <- RunPCA(Stromal, features=VariableFeatures(Stromal),dims=1:30, verbose=FALSE)
ElbowPlot(Stromal, ndims=30, reduction="pca")

Generate a clustering tree and determine clustering resolution for subclusters.

Stromal <- FindNeighbors(Stromal, dims=1:10, verbose=FALSE) 
Stromal <- FindClusters(Stromal, resolution=res, verbose=FALSE) 

# Clustering Tree
clustree(Stromal, prefix="integrated_snn_res.") 

# Choose resolution
Stromal <- FindClusters(Stromal, resolution=0.2, verbose=FALSE) 

7.6.2 Dimensional Reduction

Plot the dimensional reduction for the new cell subclusters.

# Run UMAP
Stromal <- RunUMAP(Stromal, dims=1:10, verbose=FALSE) 

# Plot UMAP
DimPlot(Stromal, reduction="umap", label=TRUE) + NoLegend() 

7.6.3 Find Subcluster Markers

Find differentially expressed markers between the subclusters.

# Find Markers
Stromal.markers <- FindAllMarkers(Stromal, only.pos = TRUE, min.pct = 0.50, logfc.threshold = 0.25, max.cells.per.ident=500, verbose=FALSE) 

# List top 10 genes for each subcluster in a table
Stromal.markers.top10 <- Stromal.markers %>% group_by(cluster) %>% top_n(n = 10, wt = avg_log2FC)
Stromal.markers.top10 %>% formattable()
p_val avg_log2FC pct.1 pct.2 p_val_adj cluster gene
4.003854e-109 0.7824967 0.974 0.541 8.007708e-106 0 Il4i1
5.453985e-77 0.7784990 0.982 0.466 1.090797e-73 0 Thbs1
4.304945e-70 0.8353014 0.947 0.427 8.609889e-67 0 Jchain
3.279298e-40 0.9020123 0.986 0.702 6.558595e-37 0 Ccl7
1.569363e-32 0.6586312 0.992 0.773 3.138725e-29 0 F3
5.324201e-29 0.6494192 0.733 0.303 1.064840e-25 0 Mmp3
1.538051e-27 0.6526392 0.733 0.373 3.076101e-24 0 Ccr7
1.954126e-24 0.6663077 0.983 0.739 3.908251e-21 0 Slfn5
5.494805e-22 0.8775580 0.997 0.879 1.098961e-18 0 Sparcl1
7.433933e-11 0.6959370 0.994 0.900 1.486787e-07 0 Col6a2
6.734413e-68 1.3242331 0.966 0.961 1.346883e-64 1 Plpp3
3.950014e-56 1.1455703 1.000 0.980 7.900027e-53 1 Gsn
1.242155e-43 1.3755558 0.932 0.944 2.484310e-40 1 Penk
4.681528e-42 1.0287474 0.936 0.688 9.363057e-39 1 Col1a2
1.486163e-40 1.3298716 0.853 0.953 2.972327e-37 1 Inmt
1.599026e-38 0.9731856 0.985 0.952 3.198053e-35 1 Ly6a
4.883237e-38 1.1262370 0.910 0.903 9.766473e-35 1 Gas1
6.907412e-27 1.6075051 0.814 0.936 1.381482e-23 1 Sult1e1
1.760261e-23 1.4116822 0.791 0.901 3.520522e-20 1 C7
1.237883e-10 1.0099330 0.620 0.787 2.475766e-07 1 Sfrp1
1.068350e-39 1.5442463 0.962 0.787 2.136699e-36 2 Cd74
3.540902e-36 1.5829035 0.924 0.693 7.081804e-33 2 Lyz2
2.473350e-35 1.4816341 0.949 0.773 4.946701e-32 2 H2-Aa
6.119473e-31 1.7894670 0.802 0.431 1.223895e-27 2 Retnla
7.931502e-26 1.6347582 0.857 0.638 1.586300e-22 2 Ccl6
1.007163e-21 1.8233884 0.844 0.734 2.014326e-18 2 Lgals3
4.411551e-19 1.5015505 0.819 0.661 8.823103e-16 2 Ctss
8.801353e-18 1.4959178 0.814 0.621 1.760271e-14 2 C1qb
1.348989e-16 2.4073080 0.511 0.683 2.697978e-13 2 Ighv1-55
1.052640e-12 4.3250061 0.654 0.329 2.105279e-09 2 Igkv3-2
3.984638e-59 2.7087855 0.986 0.294 7.969276e-56 3 Clu
1.519528e-51 1.5003904 0.972 0.651 3.039057e-48 3 Lcn2
2.181579e-49 1.4827859 0.937 0.067 4.363158e-46 3 Nkain4
7.805556e-39 4.4619050 0.155 0.669 1.561111e-35 3 Igkv2-109
6.103988e-36 1.4528182 0.930 0.594 1.220798e-32 3 Cpe
4.489712e-27 1.4657017 0.986 0.740 8.979423e-24 3 Cfb
2.349547e-23 1.4498941 0.937 0.445 4.699094e-20 3 Gas6
7.349176e-17 1.9534146 0.761 0.211 1.469835e-13 3 Slpi
4.795340e-14 1.4069466 0.732 0.120 9.590680e-11 3 Upk3b
3.301756e-03 1.4450585 0.599 0.526 1.000000e+00 3 Lgals7
1.601638e-51 1.5973540 0.975 0.723 3.203277e-48 4 Ackr3
3.648100e-41 1.8586094 0.966 0.837 7.296199e-38 4 Tmem100
5.101250e-41 2.9678236 0.899 0.298 1.020250e-37 4 Pi16
1.671409e-39 1.7129492 0.992 0.840 3.342818e-36 4 Gfpt2
4.018832e-36 1.9455177 0.916 0.308 8.037663e-33 4 Cd55
3.671642e-34 1.6765485 0.983 0.663 7.343283e-31 4 Pcolce2
9.832611e-33 1.8467213 0.992 0.902 1.966522e-29 4 Cd248
4.562555e-30 2.0264437 1.000 0.855 9.125111e-27 4 Fn1
9.205025e-30 1.6847834 1.000 0.922 1.841005e-26 4 Cd34
2.319040e-18 1.7262968 0.832 0.540 4.638081e-15 4 Fbn1
1.336985e-41 3.4858749 1.000 0.145 2.673970e-38 5 Krt19
8.047718e-41 2.9181071 1.000 0.189 1.609544e-37 5 Gpm6a
1.058012e-28 3.7811814 1.000 0.149 2.116024e-25 5 Upk3b
5.116085e-28 3.8237171 1.000 0.116 1.023217e-24 5 Nkain4
3.198906e-27 2.9349714 1.000 0.338 6.397811e-24 5 Fmod
4.054138e-27 4.0421373 1.000 0.334 8.108276e-24 5 Clu
5.185964e-27 2.9485761 1.000 0.131 1.037193e-23 5 Atp1b1
9.812781e-27 2.9220513 1.000 0.720 1.962556e-23 5 Csrp2
2.130489e-26 3.3088363 1.000 0.913 4.260977e-23 5 Efemp1
1.192224e-25 3.0903557 1.000 0.471 2.384448e-22 5 Gas6

7.6.4 Rename subcluster identities

Using the information from SingleR and differential expression, assign annotations to the subclusters.

Stromal <- RenameIdents(Stromal,
                               '0' = "Adipocyte Precursor Cells",
                               '1' = "Adipocyte Precursor Cells",
                               '2' = "Adipocyte Precursor Cells",
                               '3' = "Adipocyte Precursor Cells",
                               '4' = "Adipocyte Precursor Cells",
                               '5' = "Mesothelial-like Cells")
Stromal$highlevel <- Idents(Stromal)

7.6.5 Final Dim Plot

As a final check, take a look at the Dim Plot for the subclusters and confirm that it makes sense.

DimPlot(Stromal, reduction="umap", group.by="highlevel", label=TRUE, repel=TRUE) + NoLegend() + labs(title="Stromal Cells")

7.7 Mast Cells

7.7.1 PCA and clustering

Run PCA and determine dimensions for clustering.

Mast <- subset(data.integrated, idents = c("Mast Cells"))
Mast <- RunPCA(Mast, features=VariableFeatures(Mast),dims=1:30, verbose=FALSE)
ElbowPlot(Mast, ndims=30, reduction="pca")

Generate a clustering tree and determine clustering resolution for subclusters.

Mast <- FindNeighbors(Mast, dims=1:10, verbose=FALSE) 
Mast <- FindClusters(Mast, resolution=res, verbose=FALSE) 

# Clustering Tree
clustree(Mast, prefix="integrated_snn_res.") 

# Choose resolution
Mast <- FindClusters(Mast, resolution=0.2, verbose=FALSE) 

7.7.2 Dimensional Reduction

Plot the dimensional reduction for the new cell subclusters.

# Run UMAP
Mast <- RunUMAP(Mast, dims=1:10, verbose=FALSE) 

# Plot UMAP
DimPlot(Mast, reduction="umap", label=TRUE) + NoLegend() 

7.7.3 Rename subcluster identities

Using the information from SingleR and differential expression, assign annotations to the subclusters.

Mast <- RenameIdents(Mast, 
                     '0' = "Mast Cells",
                     '1' = "Mast Cells")
Mast$highlevel <- Idents(Mast)

7.7.4 Final Dim Plot

As a final check, take a look at the Dim Plot for the subclusters and confirm that it makes sense.

DimPlot(Mast, reduction="umap", group.by="highlevel", label=TRUE, repel=TRUE) + NoLegend() + labs(title="Mast Cells")

7.8 Neutrophils

7.8.1 PCA and clustering

Run PCA and determine dimensions for clustering.

Neutrophils <- subset(data.integrated, idents = c("Neutrophils"))
Neutrophils <- RunPCA(Neutrophils, features=VariableFeatures(Neutrophils),dims=1:30, verbose=FALSE)
ElbowPlot(Neutrophils, ndims=30, reduction="pca")

Generate a clustering tree and determine clustering resolution for subclusters.

Neutrophils <- FindNeighbors(Neutrophils, dims=1:10, verbose=FALSE) 
Neutrophils <- FindClusters(Neutrophils, resolution=res, verbose=FALSE) 

# Clustering Tree
clustree(Neutrophils, prefix="integrated_snn_res.") 

# Choose resolution
Neutrophils <- FindClusters(Neutrophils, resolution=0.2, verbose=FALSE) 

7.8.2 Dimensional Reduction

Plot the dimensional reduction for the new cell subclusters.

# Run UMAP
Neutrophils <- RunUMAP(Neutrophils, dims=1:10, verbose=FALSE) 

# Plot UMAP
DimPlot(Neutrophils, reduction="umap", label=TRUE) + NoLegend() 

7.8.3 Rename subcluster identities

Using the information from SingleR and differential expression, assign annotations to the subclusters.

Neutrophils <- RenameIdents(Neutrophils, '0' = "Neutrophils")
Neutrophils$highlevel <- Idents(Neutrophils)

7.8.4 Final Dim Plot

As a final check, take a look at the Dim Plot for the subclusters and confirm that it makes sense.

DimPlot(Neutrophils, reduction="umap", group.by="highlevel", label=TRUE, repel=TRUE) + NoLegend() + labs(title="Neutrophils")

7.9 Endothelial Cells

7.9.1 PCA and clustering

Run PCA and determine dimensions for clustering.

ECs <- subset(data.integrated, idents = c("Endothelial Cells"))
ECs <- RunPCA(ECs, features=VariableFeatures(ECs),dims=1:30, verbose=FALSE)
ElbowPlot(ECs, ndims=30, reduction="pca")

Generate a clustering tree and determine clustering resolution for subclusters.

ECs <- FindNeighbors(ECs, dims=1:10, verbose=FALSE) 
ECs <- FindClusters(ECs, resolution=res, verbose=FALSE) 

# Clustering Tree
clustree(ECs, prefix="integrated_snn_res.") 

# Choose resolution
ECs <- FindClusters(ECs, resolution=0.3, verbose=FALSE) 

7.9.2 Dimensional Reduction

Plot the dimensional reduction for the new cell subclusters.

# Run UMAP
ECs <- RunUMAP(ECs, dims=1:10, verbose=FALSE) 

# Plot UMAP
DimPlot(ECs, reduction="umap", label=TRUE) + NoLegend() 

7.9.3 Rename subcluster identities

Using the information from SingleR and differential expression, assign annotations to the subclusters.

ECs <- RenameIdents(ECs, '0' = "Endothelial Cells")
ECs$highlevel <- Idents(ECs)

7.9.4 Final Dim Plot

As a final check, take a look at the Dim Plot for the subclusters and confirm that it makes sense.

DimPlot(ECs, reduction="umap", group.by="highlevel", label=TRUE, repel=TRUE) + NoLegend() + labs(title="Endothelial Cells")

7.10 Store the subcluster identities

Once we have renamed our cell subclusters, we have to merge the subsetted data and then add this information to our original integrated Seurat object.

remerge <- merge(NKT, y=c(DCs, Bcells, ILCs, MM, Stromal, Mast, Neutrophils, ECs)) 
data.integrated <- AddMetaData(data.integrated, remerge$highlevel, col.name= "highlevel")

7.11 Correct identities

7.11.1 Endothelial to stromal

In later analysis, we reclassified endothelial cells as part of the major stromal cell cluster. To separate these changes from our original proprocessing, I designated these identities as lowlevel2

Idents(data.integrated) <- data.integrated$lowlevel
data.integrated <- RenameIdents(data.integrated, "Endothelial Cells" = "Stromal Cells")
data.integrated$lowlevel2 <- Idents(data.integrated)
data.integrated$lowlevel2 <- factor(data.integrated$lowlevel2, levels=c("Macrophages","Monocytes","Dendritic Cells","T Cells","NK Cells","B Cells","Plasma Cells","Mast Cells","Neutrophils","ILCs","Stromal Cells"))

7.11.2 Adjust highlevel subclusters

We also renamed some of the subclusters to clarify what these cell types are. Again, we gave these renamed clusters their own designation to avoid confusion with our preprocessing pipeline and called them highlevel2.

Idents(data.integrated) <- data.integrated$highlevel
data.integrated <- RenameIdents(data.integrated,
                                "Apoptotic Cell Clearance Macrophages" = "Efferocytes",
                                "gd T cells" = "gd T Cells",
                                "Proliferating Macrophages" = "Cycling Macrophages",
                                "Proliferating cDC2" = "Cycling cDC2",
                                "Proliferating cDC1" = "Cycling cDC1",
                                "Proliferating CD8+ T Cells" = "Cycling CD8+ T Cells",
                                "Effector CD8+ T Cells" = "Effector Memory CD8+ T Cells",
                                "Memory CD8+ T Cells" = "Central Memory CD8+ T Cells")
data.integrated$highlevel2 <- Idents(data.integrated)

7.11.3 Correct lowlevel and highlevel designations

For robustness, we also made sure that cells that are macrophage subclusters should be under the macrophage umbrella. This fixes some problems noted above, such as a mast cell population that was originally clustered in with the macrophages.

Idents(data.integrated) <- data.integrated$highlevel2

data.integrated$lowlevel2[WhichCells(data.integrated, idents=c("Tissue Resident Macrophages",
                                                               "LAMs",
                                                               "Cycling Macrophages",
                                                               "Efferocytes"))] <- "Macrophages"
data.integrated$lowlevel2[WhichCells(data.integrated, idents=c("Classical Monocytes",
                                                               "Non-classical Monocytes"))] <- "Monocytes"
data.integrated$lowlevel2[WhichCells(data.integrated, idents=c("cDC1",
                                                               "Activated cDC1","Cycling cDC1",
                                                               "cDC2",
                                                               "Activated cDC2",
                                                               "Cycling cDC2",
                                                               "pDCs",
                                                               "moDCs"))] <- "Dendritic Cells"
data.integrated$lowlevel2[WhichCells(data.integrated, idents=c("Th1 CD4+ T Cells",
                                                               "Th2 CD4+ T Cells",
                                                               "Tregs",
                                                               "Effector Memory CD8+ T Cells",
                                                               "Central Memory CD8+ T Cells",
                                                               "Cycling CD8+ T Cells",
                                                               "gd T Cells"))] <- "T Cells"
data.integrated$lowlevel2[WhichCells(data.integrated, idents=c("NKT Cells",
                                                               "NK Cells"))] <- "NK Cells"
data.integrated$lowlevel2[WhichCells(data.integrated, idents=c("Naive B Cells",
                                                               "Memory B Cells",
                                                               "B-1a Cells"))] <- "B Cells"
data.integrated$lowlevel2[WhichCells(data.integrated, idents=c("Plasma Cells"))] <- "Plasma Cells"
data.integrated$lowlevel2[WhichCells(data.integrated, idents=c("Mast Cells"))] <- "Mast Cells"
data.integrated$lowlevel2[WhichCells(data.integrated, idents=c("Neutrophils"))] <- "Neutrophils"
data.integrated$lowlevel2[WhichCells(data.integrated, idents=c("ILC2s"))] <- "ILCs"
data.integrated$lowlevel2[WhichCells(data.integrated, idents=c("Endothelial Cells",
                                                               "Adipocyte Precursor Cells",
                                                               "Mesothelial-like Cells"))] <- "Stromal Cells"

7.11.4 Reorder the identities

Finally, we reorder our identities, which makes downstream plotting much easier.

#Order levels of data.integrated
data.integrated$highlevel2 <- factor(data.integrated$highlevel2, 
                                     levels=c("Tissue Resident Macrophages",
                                              "LAMs",
                                              "Cycling Macrophages",
                                              "Efferocytes",
                                              "Classical Monocytes",
                                              "Non-classical Monocytes",
                                              "cDC1",
                                              "Activated cDC1",
                                              "Cycling cDC1",
                                              "cDC2",
                                              "Activated cDC2",
                                              "Cycling cDC2",
                                              "pDCs",
                                              "moDCs",
                                              "Th1 CD4+ T Cells",
                                              "Th2 CD4+ T Cells",
                                              "Tregs",
                                              "Effector Memory CD8+ T Cells",
                                              "Central Memory CD8+ T Cells",
                                              "Cycling CD8+ T Cells",
                                              "gd T Cells",
                                              "NK Cells",
                                              "NKT Cells",
                                              "Naive B Cells",
                                              "Memory B Cells",
                                              "B-1a Cells",
                                              "Plasma Cells",
                                              "Mast Cells",
                                              "Neutrophils",
                                              "ILC2s",
                                              "Endothelial Cells",
                                              "Adipocyte Precursor Cells",
                                              "Mesothelial-like Cells"))

8 Plot Cluster Annotations

8.1 Colors

First, assign the colors that we use throughout the study.

maccols <- brewer.pal(n=8, name="Blues")[c(-1,-3,-5,-7)]
monocols <- c("#ff8ade","#e324ad")
dccols <- brewer.pal(n=9, name="Greens")[-1]
tcols <- brewer.pal(n=8, name="Reds")[-1]
nkcols <- c("#876149","#6e3f22")
bcols <- brewer.pal(n=4, name="Purples")[-1]
othcols <- c("#71a157","#00c5cc","#a18e25","#b282b3")
strcols <- brewer.pal(n=4, name="Oranges")[-1]
wccols = c("#878787", "#518db6","#94cc73","#e96b53")

cols <- c(maccols,monocols,dccols,tcols,nkcols,bcols,othcols,strcols)

llcols <- c("#4292C6","#ff8ade","#238B45","#EF3B2C","#876149","#9E9AC8","#71a157","#00c5cc","#a18e25","#b282b3","#FD8D3C")

8.2 Plot Cell Types

DimPlot(data.integrated, group.by="lowlevel2", cols=llcols, label=T) + 
  NoLegend() + 
  labs(title="Cell Types")

8.3 Plot Subclusters

DimPlot(data.integrated, group.by="highlevel2", cols=cols) + 
  theme(legend.position="bottom", legend.text=element_text(size=7)) + 
  labs(title="Subclusters")

9 Save Progress

saveRDS(data.integrated, file="IntegratedData.RDS")

10 Session Info

sessionInfo()
## R version 4.1.0 (2021-05-18)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Big Sur 10.16
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] parallel  stats4    stats     graphics  grDevices utils     datasets 
## [8] methods   base     
## 
## other attached packages:
##  [1] celldex_1.2.0               formattable_0.2.1          
##  [3] SingleR_1.6.1               SummarizedExperiment_1.22.0
##  [5] Biobase_2.52.0              GenomicRanges_1.44.0       
##  [7] GenomeInfoDb_1.28.0         IRanges_2.26.0             
##  [9] S4Vectors_0.30.0            BiocGenerics_0.38.0        
## [11] MatrixGenerics_1.4.0        matrixStats_0.59.0         
## [13] clustree_0.4.3              ggraph_2.0.5               
## [15] RColorBrewer_1.1-2          plotly_4.9.4.1             
## [17] SeuratWrappers_0.3.0        cowplot_1.1.1              
## [19] dplyr_1.0.7                 ggplot2_3.3.5              
## [21] SeuratObject_4.0.2          Seurat_4.0.3               
## 
## loaded via a namespace (and not attached):
##   [1] utf8_1.2.1                    reticulate_1.20              
##   [3] R.utils_2.10.1                tidyselect_1.1.1             
##   [5] RSQLite_2.2.7                 AnnotationDbi_1.54.1         
##   [7] htmlwidgets_1.5.3             grid_4.1.0                   
##   [9] BiocParallel_1.26.0           Rtsne_0.15                   
##  [11] munsell_0.5.0                 ScaledMatrix_1.0.0           
##  [13] codetools_0.2-18              ica_1.0-2                    
##  [15] future_1.21.0                 miniUI_0.1.1.1               
##  [17] withr_2.4.2                   colorspace_2.0-2             
##  [19] filelock_1.0.2                highr_0.9                    
##  [21] knitr_1.33                    SingleCellExperiment_1.14.1  
##  [23] ROCR_1.0-11                   tensor_1.5                   
##  [25] listenv_0.8.0                 labeling_0.4.2               
##  [27] GenomeInfoDbData_1.2.6        polyclip_1.10-0              
##  [29] bit64_4.0.5                   farver_2.1.0                 
##  [31] parallelly_1.27.0             vctrs_0.3.8                  
##  [33] generics_0.1.0                xfun_0.24                    
##  [35] BiocFileCache_2.0.0           R6_2.5.0                     
##  [37] graphlayouts_0.7.1            rsvd_1.0.5                   
##  [39] bitops_1.0-7                  spatstat.utils_2.2-0         
##  [41] cachem_1.0.5                  DelayedArray_0.18.0          
##  [43] assertthat_0.2.1              promises_1.2.0.1             
##  [45] scales_1.1.1                  gtable_0.3.0                 
##  [47] beachmat_2.8.0                globals_0.14.0               
##  [49] goftest_1.2-2                 tidygraph_1.2.0              
##  [51] rlang_0.4.11                  splines_4.1.0                
##  [53] lazyeval_0.2.2                spatstat.geom_2.2-2          
##  [55] checkmate_2.0.0               BiocManager_1.30.16          
##  [57] yaml_2.2.1                    reshape2_1.4.4               
##  [59] abind_1.4-5                   backports_1.2.1              
##  [61] httpuv_1.6.1                  tools_4.1.0                  
##  [63] ellipsis_0.3.2                spatstat.core_2.3-0          
##  [65] jquerylib_0.1.4               ggridges_0.5.3               
##  [67] Rcpp_1.0.7                    plyr_1.8.6                   
##  [69] sparseMatrixStats_1.4.0       zlibbioc_1.38.0              
##  [71] purrr_0.3.4                   RCurl_1.98-1.3               
##  [73] rpart_4.1-15                  deldir_0.2-10                
##  [75] pbapply_1.4-3                 viridis_0.6.1                
##  [77] zoo_1.8-9                     ggrepel_0.9.1                
##  [79] cluster_2.1.2                 magrittr_2.0.1               
##  [81] RSpectra_0.16-0               data.table_1.14.0            
##  [83] scattermore_0.7               lmtest_0.9-38                
##  [85] RANN_2.6.1                    fitdistrplus_1.1-5           
##  [87] patchwork_1.1.1               mime_0.11                    
##  [89] evaluate_0.14                 xtable_1.8-4                 
##  [91] gridExtra_2.3                 compiler_4.1.0               
##  [93] tibble_3.1.2                  KernSmooth_2.23-20           
##  [95] crayon_1.4.1                  R.oo_1.24.0                  
##  [97] htmltools_0.5.1.1             mgcv_1.8-36                  
##  [99] later_1.2.0                   tidyr_1.1.3                  
## [101] DBI_1.1.1                     ExperimentHub_2.0.0          
## [103] tweenr_1.0.2                  dbplyr_2.1.1                 
## [105] rappdirs_0.3.3                MASS_7.3-54                  
## [107] Matrix_1.3-4                  R.methodsS3_1.8.1            
## [109] igraph_1.2.6                  pkgconfig_2.0.3              
## [111] spatstat.sparse_2.0-0         bslib_0.2.5.1                
## [113] XVector_0.32.0                stringr_1.4.0                
## [115] digest_0.6.27                 sctransform_0.3.2            
## [117] RcppAnnoy_0.0.18              spatstat.data_2.1-0          
## [119] Biostrings_2.60.1             rmarkdown_2.9                
## [121] leiden_0.3.8                  uwot_0.1.10                  
## [123] DelayedMatrixStats_1.14.0     curl_4.3.2                   
## [125] shiny_1.6.0                   lifecycle_1.0.0              
## [127] nlme_3.1-152                  jsonlite_1.7.2               
## [129] BiocNeighbors_1.10.0          limma_3.48.0                 
## [131] viridisLite_0.4.0             fansi_0.5.0                  
## [133] pillar_1.6.1                  lattice_0.20-44              
## [135] KEGGREST_1.32.0               fastmap_1.1.0                
## [137] httr_1.4.2                    survival_3.2-11              
## [139] interactiveDisplayBase_1.30.0 glue_1.4.2                   
## [141] remotes_2.4.0                 png_0.1-7                    
## [143] BiocVersion_3.13.1            bit_4.0.4                    
## [145] ggforce_0.3.3                 stringi_1.7.3                
## [147] sass_0.4.0                    blob_1.2.1                   
## [149] BiocSingular_1.8.1            AnnotationHub_3.0.1          
## [151] memoise_2.0.0                 irlba_2.3.3                  
## [153] future.apply_1.7.0